Post by Sam Geeraerts
On Sun, 23 Apr 2017 19:53:31 +0200
Post by Paul Boddie
Well, I rediscovered how to get access to the script and modified it
to be somewhat more general. It is attached to this message.
Am I right that this creates a list of users that have never edited?
Sorry, I should have provided some guidance! Effectively, you get a collection
of files, with the most important ones being these:
bad.txt should be a collection of users who never edited
bad-domains.txt should just summarise where the bad users came from (which
domains were used in their e-mail addresses)
The principal goal of this script is to prune users from the wiki who
registered in an attempt to spam the wiki and who managed to verify their
e-mail details, but who didn't get any further because of other measures in
place to prevent unwanted edits.
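
To illustrate the idea (this is not the attached script, just a minimal sketch), the "never edited" list and the domain summary could be derived roughly as follows. The function names, the username-to-address dictionary and the sample data are all hypothetical; the real account files will need their own parsing.

```python
from collections import Counter


def find_bad_users(accounts, editors):
    """Return the usernames that registered but never edited, sorted.

    accounts: dict mapping username -> e-mail address (assumed layout)
    editors:  iterable of usernames known to have edited
    """
    editor_set = set(editors)
    return sorted(name for name in accounts if name not in editor_set)


def summarise_domains(accounts, bad_users):
    """Count the e-mail domains used by the bad users."""
    counts = Counter()
    for name in bad_users:
        # Take everything after the last "@" and normalise the case.
        domain = accounts[name].rpartition("@")[2].lower()
        counts[domain] += 1
    return counts


# Hypothetical sample data, purely for illustration.
accounts = {
    "alice": "alice@example.org",
    "bob": "bob@example.net",
    "spam1": "x1@spam.example",
    "spam2": "x2@Spam.example",
}
editors = ["alice", "bob"]

bad = find_bad_users(accounts, editors)
domains = summarise_domains(accounts, bad)
```

Here `bad` plays the role of bad.txt and `domains` the role of bad-domains.txt.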
Although this might seem like an unlikely scenario, I imagine that it has
become more common to permit self-registration, mostly because that does save
administrators the effort of interacting with the account creation mechanisms,
but then to have explicit access control lists that grant specific users
editing privileges. Getting such privileges is a matter of interacting with
the community running the wiki and building up enough trust that an
administrator then updates the list with a new user's details, which is a
fairly simple action.
The other files are as follows:
accounts.txt should be a collection of e-mail address lines from account files
editors_migrated.txt are usernames of people whose content was imported into
the wiki (not really appropriate here, I guess, but you never know)
editors_wiki.txt are accounts of people who have edited the wiki
editors.txt combines the two sources of editors (using mapping.txt to map
usernames to identifiers)
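
As a rough sketch of that combining step (again, not the script itself): I am assuming here that the wiki editors are already recorded as identifiers, that the migrated editors are recorded as usernames, and that the mapping is a simple username-to-identifier lookup. The names below are hypothetical.

```python
def combine_editors(wiki_editor_ids, migrated_usernames, mapping):
    """Merge both sources of editors into one set of identifiers.

    wiki_editor_ids:    identifiers of accounts that edited the wiki
    migrated_usernames: usernames whose content was imported
    mapping:            dict of username -> identifier (assumed layout)

    Returns (combined identifiers, sorted usernames with no mapping).
    """
    combined = set(wiki_editor_ids)
    unmapped = []
    for username in migrated_usernames:
        identifier = mapping.get(username)
        if identifier is None:
            # Keep track of names we could not resolve rather than
            # silently dropping them.
            unmapped.append(username)
        else:
            combined.add(identifier)
    return combined, sorted(unmapped)


# Hypothetical sample data.
combined, unmapped = combine_editors(
    ["id-1", "id-2"],
    ["carol", "dave"],
    {"carol": "id-3"},
)
```

Reporting the unmapped names separately seems safer than ignoring them, since an editor missing from the combined list would otherwise end up looking like a bad user.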
Possibly less interesting are the files that deal with people registering for
the wiki and having to verify their accounts over e-mail:
unverified.txt are accounts that still need to perform e-mail verification
verified.txt are accounts that do not need to perform e-mail verification
Generally, spammers are quite able to verify their accounts, so this is not a
sufficient measure to prevent spamming.
For your purposes, I would imagine that you would mostly be interested in just
removing spurious accounts, many of which could have been generated by
potential spammers. It might be interesting to monitor the rate of new account
creation. When the Mailman Wiki was migrated, we encountered such issues very
early on, but I haven't looked too closely at the situation since then.
So, running the script should just produce these files which you can then
inspect. If there are few bad users, things become a bit more involved in
terms of figuring out whether those users really can be removed - they could
be spammers that did manage to edit or they could be genuine editors - but
with many bad users, you would need to just "sanity check" the contents of the
file and see that the script really did do its job correctly.
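
One way to do such a sanity check, sketched under the assumption that both lists have been loaded as plain usernames (the helper names are made up for this example): confirm that no known editor has slipped into the bad list, and pull out a small random sample to look over by hand.

```python
import random


def editors_in_bad_list(bad_users, editors):
    """Return any names that appear in both lists, sorted.

    A non-empty result suggests something went wrong, since a bad
    user is by definition someone who never edited.
    """
    return sorted(set(bad_users) & set(editors))


def sample_for_review(bad_users, size=10, seed=0):
    """Pick up to `size` bad users at random for manual inspection."""
    rng = random.Random(seed)  # fixed seed so the sample is repeatable
    users = sorted(set(bad_users))
    return rng.sample(users, min(size, len(users)))


# Hypothetical sample data.
bad_users = ["spam1", "spam2", "bob"]
editors = ["alice", "bob"]

overlap = editors_in_bad_list(bad_users, editors)
sample = sample_for_review(bad_users, size=2)
```

With the sample data above, "bob" shows up in the overlap and would deserve a closer look before anything is removed.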
I hope this is a bit more helpful!
Paul