[This thread is a continuation of post #97 (and some following) from the Fighting spammers thread]
[paste of post #97]
Lately I've started finding my way around the Recent Changes page, to try to help recognize spam or vandalism, and yesterday I noticed and had to undo a spam edit submitted by a completely trustworthy user with quite a long history of useful contributions: he wasn't aware of that edit at all, so his account password had probably been discovered by a bot, or he made the edit from a Windows client infected with a spamming worm...
What I'm trying to say is that the fight against spam and vandalism shouldn't stop at registration time: we should discuss a way to promptly recognize spam or vandalism edits using the Recent Changes page and evaluating article diffs. I think we should organize a (multilanguage) team in which each user takes responsibility for regularly checking a specific group of pages in the Recent Changes page. The grouping could simply be alphabetical: for example, if the team was composed of 26 users, each could take a different letter and check daily all the articles in the Recent Changes page that start with that letter. Of course, if the team was composed of more users, the articles could be assigned based on the first 2 letters, and so on.
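The alphabetical split described above could be sketched roughly as follows. This is only an illustration in Python: the function names `assign_letters` and `reviewer_for` are made up for the example, and it only handles the one-letter case (not the two-letter split for larger teams).

```python
import string


def assign_letters(members):
    """Split A-Z into contiguous chunks, one chunk per member (hypothetical helper)."""
    letters = string.ascii_uppercase
    size, extra = divmod(len(letters), len(members))
    assignment, start = {}, 0
    for i, member in enumerate(members):
        # The first `extra` members get one more letter, so all 26 are covered.
        end = start + size + (1 if i < extra else 0)
        assignment[member] = letters[start:end]
        start = end
    return assignment


def reviewer_for(title, assignment):
    """Return the member responsible for an article title, or None."""
    first = title[:1].upper()
    for member, letters in assignment.items():
        if first in letters:
            return member
    # Titles starting with a digit or symbol are left unassigned in this sketch.
    return None


team = assign_letters(["kynikos", "karol"])
print(team)
print(reviewer_for("Pacman", team))
```

With two members this reproduces an "A to M" / "N to Z" split; titles that don't start with a letter would need their own rule.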
Checking for spam or vandalism should be very quick and easy, and based on human judgement; it shouldn't also mean checking for other kinds of problems, like typos, coding errors etc., which would take much more time.
The Arch wiki is not Wikipedia: the number of changes per day is not so high, so even checking 2 or 3 letters per user (in case there are only a few users, or we want to overlap users' checking areas) would be quite easy, yet very useful to the community, especially as it progressively grows over time.
A very nice thing to do alongside such a system would be to improve the filtering options in the Recent Changes page, to let users filter changes by article name: supporting regular expressions could become useful if we also wanted to assign pages based on their language, which is usually appended at the end of the name. Even better would be personalized RSS/Atom feeds.
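To give an idea of what regular-expression filtering by article name and language could look like, here is a small Python sketch. It is not an existing wiki feature: the sample titles, the helper names and the assumption that a title without a parenthesized suffix is English are all made up for the example.

```python
import re

# Hypothetical sample of titles as they might appear in Recent Changes.
titles = [
    "Beginners' Guide",
    "Beginners' Guide (Italiano)",
    "Pacman",
    "Pacman (Polski)",
    "Xorg",
]

# A language suffix such as " (Polski)" or "_(Polski)" at the end of the title.
LANG_SUFFIX = re.compile(r"[ _]\((?P<lang>[^()]+)\)$")


def language_of(title, default="English"):
    """Guess the article language from its suffix (assumption: no suffix means English)."""
    found = LANG_SUFFIX.search(title)
    return found.group("lang") if found else default


def matches(title, letter_range, language):
    """True if the title starts with a letter in e.g. "A-M" and is in `language`."""
    starts = re.match(f"[{letter_range}]", title, re.IGNORECASE)
    return bool(starts) and language_of(title) == language


print([t for t in titles if matches(t, "N-Z", "Polski")])
print([t for t in titles if matches(t, "A-M", "English")])
```

An English article whose name happens to end in parentheses would be misclassified here, so a real filter would probably need a list of known languages.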
I believe this could be a great improvement for little effort: it would just need a bit of organization. What do you think?
Last edited by kynikos (2011-02-13 21:30:38)
Offline
@ Pierre
- Well, about the compromised account, in post #99 I've reported the undo edit and that user's contributions
- Can you do something about improving the Recent Changes page filters, like filtering changes by the article name, or even letting users personalize the Atom feed at least with query strings?
The best thing would be if each subscribed user could receive a daily feed with the changes of the articles under his "jurisdiction" and the diffs to check, with only one diff per article which shows all the changes it has undergone during the day.
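The "one diff per article per day" idea could be sketched like this in Python. The change records and the base URL are hypothetical; the sketch assumes revision ids only grow over time and builds a diff link with the `title`, `diff` and `oldid` parameters that MediaWiki's `index.php` accepts. Newly created pages are not handled.

```python
from urllib.parse import urlencode

# Hypothetical day of changes: (title, old revision id, new revision id).
changes = [
    ("Pacman", 100, 104),
    ("Xorg", 101, 105),
    ("Pacman", 104, 110),
    ("Pacman", 110, 112),
]


def daily_spans(changes):
    """Collapse all edits to an article into one (oldest, newest) revision pair."""
    spans = {}
    for title, old, new in changes:
        lo, hi = spans.get(title, (old, new))
        spans[title] = (min(lo, old), max(hi, new))
    return spans


def diff_url(base, title, old, new):
    """Build a link to a single diff covering everything between two revisions."""
    return base + "?" + urlencode({"title": title, "diff": new, "oldid": old})


for title, (old, new) in daily_spans(changes).items():
    print(diff_url("https://wiki.example.org/index.php", title, old, new))
```

Each subscribed user would then only need the links for the titles in his own range.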
As I'm trying to see if some people are interested in joining, I'm starting a subscription list here:
1. kynikos [example: articles from A to M]
2. karol [example: articles from N to Z]
If you're interested, just reply to the thread and write your wiki username (if different to the forum one): we'll discuss how to divide the articles as the thread evolves.
If this project gets a foothold, the list and related info could then be moved to an official wiki page.
Last edited by kynikos (2011-02-13 22:44:12)
Offline
@ karol
I assume you are interested (otherwise I feel alone in the list): if you're not, just reply to this thread.
@ Pierre
- Can you do something about improving the Recent Changes page filters, like filtering changes by the article name, or even letting users personalize the Atom feed at least with query strings?
I'm interested. Don't count on me doing any major translations or correction of wiki articles but I can check all the edits. In newsbeuter it's as easy as pressing 'o' :-) I don't mind checking all the edits, so I don't need any scripting / sorting.
Offline
I'm interested. Don't count on me doing any major translations or correction of wiki articles but I can check all the edits.
Perfect! Subscribe to this thread, as we'll decide how to organize ourselves later on. You're brave if you really check all the edits, but there are hundreds each day, and if contributions grow as new users register, things could get out of control and people like you, me or others could lose interest in doing it.
Just to make it clear once again: this project does not aim at translating or even correcting mistakes! The only purpose would be to put together a team of users who protect the wiki from:
- insertion of spam text, like ads or any other kind of useless junk, most often automatically created;
- vandalism, like the deletion of entire pieces of articles, for purely destructive purposes;
- deletion of useful content in good faith.
These users should take the responsibility of checking the articles that have been assigned to them on a daily basis; then, if they want to check more deeply those or other articles they're interested in, it's a welcome thing, but it's beyond this team's purposes.
Because the articles would be divided among the members, and because recognizing spam or vandalism is very quick and easy for a human, this task should take very little time.
Lastly, organizing this work would make sure that users aren't checking the same articles by accident, thus avoiding wasted effort.
Last edited by kynikos (2011-02-13 22:50:06)
Offline
The thing is, how do you ensure that every member of the team is actually doing his work? I can get hit by a car or be otherwise unable to check the edits I'm supposed to be checking, so doubling the efforts (I mean two people checking the same things) has some merit to it.
- insertion of spam text, like ads or any other kind of useless junk, most often automatically created;
- vandalism, like the deletion of entire pieces of articles, for purely destructive purposes;
- deletion of useful content in good faith.
The first one is easy, the next two depend on:
- technical knowledge - if I don't know a thing about KDE, how will I know if the edits are destructive or beneficial?
- the language the article is written in - I speak only Polish and English.
Minor edits in a language I do not speak are doable, but major reworking is not, I doubt even Google Translate would help. And you can kiss 'this task should take very little time' goodbye :-)
I can get by by comparing this edit to this one. This edit is pretty easy to figure out too.
Last edited by karol (2011-02-13 23:02:37)
Offline
I may have missed some of this discussion - and I think this is a fine idea - but I am already subscribed to every article that I have contributed to and so I see all of the changes anyway. Is what you are proposing in addition to this, or to complement it?
Offline
I may have missed some of this discussion - and I think this is a fine idea - but I am already subscribed to every article that I have contributed to and so I see all of the changes anyway. Is what you are proposing in addition to this, or to complement it?
If you contributed to an article it's more likely you have the knowledge to discern good edits from bad edits and quite obviously (well, apart from fixing typos) you speak the language well enough.
Whether extending this (beyond BuyCheapDrugs-spam prevention) to all articles is feasible, I don't know.
Last edited by karol (2011-02-13 23:00:49)
Offline
The thing is, how do you ensure that every member of the team is actually doing his work? I can get hit by a car or be otherwise unable to check the edits I'm supposed to be checking, so doubling the efforts (I mean two people checking the same things) has some merit to it.
If you think about it, you're describing the current situation: without organization, as things are now, there's no way of knowing whether all changes are being checked. First, you're right, we should smartly overlap the users' areas of competence: for example, if there are 3 members, user1 could check from A to R, user2 could check from H to Z and user3 could check (from A to H) & (from R to Z). Second, members could be required to add a "done" tag next to their username in the list, in a special column reset every day: if one day a "done" is missing for some users and a range of changes has remained officially unchecked, somebody else could compensate for it.
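The three-member example and the "done" column can be checked with a short Python sketch. The ranges are the ones from the example above; the helper names (`expand`, `coverage`, `unchecked`) are invented for the illustration.

```python
import string


def expand(ranges):
    """Turn [("A", "R")] into the set of letters it spans, inclusive."""
    letters = set()
    for first, last in ranges:
        letters.update(chr(c) for c in range(ord(first), ord(last) + 1))
    return letters


# The three-member example: every letter should be checked by at least two users.
areas = {
    "user1": expand([("A", "R")]),
    "user2": expand([("H", "Z")]),
    "user3": expand([("A", "H"), ("R", "Z")]),
}


def coverage(areas):
    """Count, for each letter, how many members are supposed to check it."""
    return {l: sum(l in a for a in areas.values()) for l in string.ascii_uppercase}


def unchecked(areas, done):
    """Letters nobody has checked yet, given the members who wrote "done" today."""
    checked = set().union(*(areas[m] for m in done))
    return sorted(set(string.ascii_uppercase) - checked)


print(min(coverage(areas).values()))
print(unchecked(areas, {"user1"}))
```

If only user1 has marked "done", the second call lists the letters from S to Z, which is the range somebody else would have to compensate for.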
The first one is easy, the next two depend on:
- technical knowledge - if I don't know a thing about KDE, how will I know if the edits are destructive or beneficial?
If an entire section is deleted without a discussion or any other kind of written justification, then that's not the right way to do it anyway, not even if the deletion is actually beneficial. If in doubt, it could at least be reported on the discussion page. It's a rare event anyway; this approach would just make sure it doesn't go unnoticed.
- the language the article is written in - I speak only Polish and English.
As I've already said, the division should also be language-related, so each member would only have articles in languages they understand: almost all articles in languages other than English have a "_(Language)" suffix, so they are easy to classify.
Minor edits in a language I do not speak are doable, but major reworking is not, I doubt even Google Translate would help. And you can kiss 'this task should take very little time' goodbye :-)
I can get by by comparing this edit to this one. This edit is pretty easy to figure out too.
It should be clear at this point: I'm not trying to put together a maintainer group. Editing articles, correcting typos or coding mistakes etc. is something one can do in addition to spam checking.
Last edited by kynikos (2011-02-14 00:03:31)
Offline
I may have missed some of this discussion - and I think this is a fine idea - but I am already subscribed to every article that I have contributed to and so I see all of the changes anyway. Is what you are proposing in addition to this, or to complement it?
I don't understand the difference between "in addition to this" and "to complement it". Anyway, each member of this project should:
A) do the 3 antispam tasks I've described before on the daily article changes assigned to him;
AND
B) do everything else he wants on all the articles he wants, like those he can have in his watchlist: this is not meant to be part of this project.
In the end, all this is meant to let people know that the articles are regularly checked for spam, so there's no need to waste energy doing it by oneself in a disorganized way: if one wants to help fight spam, he can subscribe to this list; otherwise, if he has time to spend on the wiki, he can concentrate on other jobs, like checking more deeply only the articles he's interested in.
Last edited by kynikos (2011-02-14 00:23:23)
Offline
- do everything else he wants on all the articles he wants, like those he can have in his watchlist: this is not meant to be part of this project.
Thanks, that answers my question. I'll remain in the complementary camp.
Offline
For the record, I try my best to review all changes every day. I believe wiki spam is well-controlled for the most part, but more active maintenance is always appreciated. Specifically, it would be nice if we could trim down the list of Unwatched Pages.
M*cr*s*ft: Who needs quality when you have marketing?
Offline
For the record, I try my best to review all changes every day. I believe wiki spam is well-controlled for the most part, but more active maintenance is always appreciated. Specifically, it would be nice if we could trim down the list of Unwatched Pages.
Sorry:
Permission error
From ArchWiki
Jump to: navigation, search
The action you have requested is limited to users in the group: Administrators.
Return to Main Page.
Offline
Specifically, it would be nice if we could trim down the list of Unwatched Pages.
If you forward me the list/give me access, I would be keen to adopt a couple dozen...
Offline
Unfortunately, the list contains over 2000 pages (though some are in the User and Category namespaces). I would be more interested in matching Popular pages within the Unwatched list with a bit of SQL-fu and posting that...
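Matching the popular pages against the unwatched list is essentially an intersection of two lists. The post mentions SQL; the sketch below does the same thing in Python on exported data instead, and both the sample titles and the view counts are invented for the illustration.

```python
# Hypothetical exports: (title, view count) pairs and a set of unwatched titles.
popular = [("Xorg", 20000), ("Beginners' Guide", 50000), ("Pacman", 30000)]
unwatched = {"Xorg", "Pacman (Polski)", "Pacman"}


def popular_unwatched(popular, unwatched, limit=10):
    """Return unwatched titles ordered by popularity, most viewed first."""
    ranked = sorted(popular, key=lambda page: page[1], reverse=True)
    return [title for title, _views in ranked if title in unwatched][:limit]


print(popular_unwatched(popular, unwatched))
```

The result would be a short list of pages that are both widely read and watched by nobody, which seem like the first candidates for adoption.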
In general, many non-English article translations are not watched/maintained. As someone who only understands English and a smidgen of French, it is more difficult to spot abuse on these pages -- this is where I could see an anti-spam team being useful. However, as Pierre already noted, we are encouraging the creation of separate localized wikis to address this issue.
M*cr*s*ft: Who needs quality when you have marketing?
Offline
Adopting unwatched pages is surely a good thing, but it's not a solution for regular spam/vandalism checking, which should be addressed in a more official, reliable and organized way.
An intermediate solution would be to highlight in some way the unwatched articles in the recent changes page, so that regular controllers could pay more attention to them.
Offline
If all the current versions of ArchWiki articles are verified to be spam-free, then it would be enough to monitor recent changes, right?
Offline
@ karol
Eheh, that's self-evident, it's a fundamental assumption of this project... Of course, checking for already-existing spam insertions could only be done by users as they read articles: realistically it's impossible to think that all current articles are spam-free, but an organized antispam team could prevent almost all future spam additions, while old spam would be (slowly) trimmed away over time.
Offline
Do the current versions have only the first type of spam
- insertion of spam text, like ads or any other kind of useless junk, most often automatically created;
or do you suggest checking some previous versions and comparing for
- vandalism, like the deletion of entire pieces of articles, for purely destructive purposes;
- deletion of useful content in good faith.
?
Would reading the article be enough or do we need to read the wiki code? Apart from comments, can some tags be (ab)used to hide something?
Offline
Do the current versions have only the first type of spam [...] or do you suggest checking some previous versions and comparing for [...]?
Large-scale checking for already-existing spam or vandalism edits would require a completely different kind of organization and searching method, and it would certainly need other tools than just the Recent Changes page: it would be a much larger and harder mission. This team should systematically address only future attacks.
Would reading the article be enough or do we need to read the wiki code? Apart from comments, can some tags be (ab)used to hide something?
The diffs directly show the source code, not the processed text, so even if there were other elements that can hide content (like fake templates and so on), everything would be visible.
Offline
karol wrote: Do the current versions have only the first type of spam [...] or do you suggest checking some previous versions and comparing for [...]?
Large-scale checking for already-existing spam or vandalism edits would require a completely different kind of organization and searching method, and it would certainly need other tools than just the Recent Changes page: it would be a much larger and harder mission. This team should systematically address only future attacks.
Ah, I thought we were going to go through the already written articles too.
Would reading the article be enough or do we need to read the wiki code? Apart from comments, can some tags be (ab)used to hide something?
The diffs directly show the source code, not the processed text, so even if there were other elements that can hide content (like fake templates and so on), everything would be visible.
OK, I think I finally got it :-)
Offline
Karol, it looks like nobody else is interested at the moment; maybe this project will become more useful in the future.
Anyway, I'll stay subscribed to this thread, so every reply will be taken into consideration.
I'm checking every wiki edit and so far I'm not overwhelmed by the amount of work, so I think I can keep it up. I'm more concerned with the 'bus factor'. 'Bus factor' is the number of people that have to be "hit by a bus" (i.e. unable/unwilling to work on the project anymore) for the project to fail. The lower the number, the more likely the lofty goal set by the project will not be met.
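As a rough illustration of that definition applied to edit checking: if the "project" is that every letter of the alphabet keeps at least one checker, the bus factor is the smallest number of checkers any single letter has. The Python sketch below uses made-up team layouts and a made-up `bus_factor` helper.

```python
import string

LETTERS = string.ascii_uppercase


def bus_factor(areas, items=LETTERS):
    """Fewest members whose loss leaves some item with nobody checking it."""
    return min(sum(item in area for area in areas.values()) for item in items)


# One person checking everything, as in the situation described above.
solo = {"karol": set(LETTERS)}
# Two people who both check everything.
pair = {"karol": set(LETTERS), "kynikos": set(LETTERS)}

print(bus_factor(solo))
print(bus_factor(pair))
```

The first call prints 1 and the second prints 2, so under these assumptions a second full checker is what raises the number.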
Offline