@ontobelli. Ok. Added to latest version.
Version 0.10.6 (13.08.2012)
*spruced up default blocklist entries in rc.conf.
*fixed a small issue in hostsblock-urlcheck.
Check out hostsblock for system-wide ad- and malware-blocking.
Offline
Hi all. I've made a new homepage for hostsblock. Check it out: http://www.personal.psu.edu/jav209/scripting.html
Offline
As far as I can see, the commands gaenserich's script uses to process the downloaded files, as well as the black and white lists, result in a shorter list of blocked domains.
I used the same black and white lists (including about 20-30 entries each) and the following lists:
http://hosts-file.net/ad_servers.asp
http://hostsfile.mine.nu/Hosts
http://www.hostsfile.org/Downloads/hosts.txt
http://pgl.yoyo.org/as/serverlist.php?h … =plaintext
http://someonewhocares.org/hosts/hosts
http://sysctl.org/cameleon/hosts
http://winhelp2002.mvps.org/hosts.txt
And ontobelli's script gave me a list of 127,361 domains, while gaenserich's script listed 118,650 domains.
Offline
I haven't used ontobelli's script yet. Assuming that both outputs are well sorted, could you perhaps diff the two to see what the deal is?
Offline
I tried both meld and diff, but they don't seem to work properly with these files (due to their size?).
They can't even compare them line by line.
Offline
Apparently my problem was caused by the fact that one file used a space and the other a tab as the delimiter.
I'm not experienced with these tools, but Meld offered to save the differences in patch format, which you can find here:
http://minus.com/lbqPrrHUpCvjjm
I hope it helps...
Offline
Assuming hosts.block is from hostsblock, I did a quick scan to see whether what was missing from your hosts.block was also missing from mine. I only got three results:
$ for LINE in `grep "^+" hosts.block\ -\ hosts.patch | sed "s/^+//g"`; do
      # escape the dots so they match literally in the grep below
      newline=`echo $LINE | sed "s|\.|\\\.|g"`
      if grep -E "[[:space:]]$newline" /etc/hosts.block &>/dev/null; then
          /bin/true
      else
          echo "$LINE"
      fi
  done
adnet.com.tr
adtext.adnet.com.tr
www.adnet.com.tr
The three missing items might just be from a delayed update. At least this tells me that it probably isn't a sorting issue in hostsblock; more likely your cached blocklists aren't up to date somehow. If you delete all the files under "/var/cache/hostsblock" and then re-run, do you get different results?
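For what it's worth, a comparison like this can also be done without meld by normalizing the delimiter first, which is exactly what tripped up the earlier diff attempt. A minimal sketch (file names and sample entries are made up for illustration, not taken from the real lists):

```shell
# Sample data: same domains, different delimiters, one extra entry in b.
printf '0.0.0.0\tads.example.com\n0.0.0.0\ttracker.example.net\n' > a.hosts
printf '0.0.0.0 ads.example.com\n0.0.0.0 tracker.example.net\n0.0.0.0 extra.example.org\n' > b.hosts

# Keep only the domain column (awk splits on any whitespace), then sort.
domains() { awk '{print $2}' "$1" | sort -u; }

domains a.hosts > a.domains
domains b.hosts > b.domains

# Domains present in b.hosts but missing from a.hosts:
comm -13 a.domains b.domains
```

Because awk splits on runs of spaces or tabs alike, the space-vs-tab problem disappears before the comparison.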
Offline
These 3 are from my own blacklist.
So I tested again, this time deleting the entire "/var/cache/hostsblock" directory (and using no black- or whitelist entries), and compared the resulting files: no difference this time, except that one has "0.0.0.0 0.0.0.0" and the other has "0.0.0.0 127.0.0.1", which is trivial.
Thanks a lot...
Offline
A cosmetic suggestion:
- changing `date` in echo -e "\nHostsblock started/completed at `date`\n" to $(date +"%x %R") (or even removing these two lines), and
- adding to the end of the script:
echo -e "Hostsblock updated $number domains on $(date +"%x %R")" >> "/etc/hostsblock/hostsblock.log"
Offline
Hi.
In general the script is working fine. But sometimes there are errors in the downloaded files that go undetected, and the final hosts file ends up incomplete. The script should check the time and size of the files, and the error status, in case curl supports that.
I don't see the need to gzip the old hosts file. Could it be made optional, or even removed?
Offline
@sadi:
I tweaked the date/time format to your suggestion, but with %R changed to %T so that we can still see seconds.
As for removing those lines, I think it wiser to develop a verbosity setting for that (patches eagerly accepted!). I have now integrated logging, so whatever gets pushed to stdout and stderr will get redirected to your logfile, if you want.
@ontobelli
I tweaked curl to have a longer timeout (60 seconds), plus made the whole script fail if curl gives an error code. I also removed the gzipping of the backup... one less dependency to worry about.
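A minimal sketch of what such a hardened download step might look like (the helper name, flags, and paths here are illustrative assumptions, not hostsblock's actual code):

```shell
# Illustrative only: a download helper with a 60-second timeout whose
# exit status is checked, so a failed fetch can abort the update.
fetch() {
    # --fail turns HTTP errors (e.g. 404) into a non-zero exit code;
    # --max-time 60 gives up after 60 seconds.
    curl --fail --silent --max-time 60 --output "$2" "$1"
}

url='file:///nonexistent/blocklist'   # stand-in for a real blocklist URL
if fetch "$url" /tmp/blocklist.tmp; then
    echo "downloaded $url"
else
    echo "download failed; keeping the cached copy"
fi
```

Checking curl's exit status explicitly (rather than ignoring it) is what keeps a truncated download from silently producing an incomplete hosts file.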
Version 0.11 (18.08.2012)
*fixed typo in hostsblock-urlcheck
*added integrated logging
*removed gzipping of the target hostsfile backup; removed gzip as a dependency.
*added extended timeout to downloading procedure, provided for script exit in the event of failed download
*tweaked the format of dates and times in log output
Offline
Testing ALL lists, and everything is working fine so far.
hosts file: 38.1 MiB and 1,343,000 lines
dnsmasq: 120 MiB in RAM
Uncached DNS requests: approx. 200 ms
Cached DNS requests: 5 ms
Better than expected!
Last edited by ontobelli (2012-08-19 09:18:44)
Offline
I did something wrong last time.
Hostsblock started at 08/19/2012 05:44:39
Checking blocklists for updates...
http://hosts-file.net/download/hosts.zip...no changes
http://hosts-file.net/ad_servers.asp...no changes
http://hosts-file.net/hphosts-partial.asp...no changes
http://hostsfile.mine.nu/Hosts.zip...no changes
http://hostsfile.org/Downloads/BadHosts.unx.zip...no changes
http://pgl.yoyo.org/as/serverlist.php?hostformat=hosts&mimetype=plaintext...no changes
http://someonewhocares.org/hosts/hosts...no changes
http://sysctl.org/cameleon/hosts...no changes
http://winhelp2002.mvps.org/hosts.zip...no changes
http://www.ismeh.com/HOSTS...UPDATED
http://www.malwaredomainlist.com/hostslist/hosts.txt...no changes
http://abp.mozilla-hispano.org/nauscopio/hosts.zip...no changes
http://rlwpx.free.fr/WPFF/htrc.7z...no changes
http://rlwpx.free.fr/WPFF/hpub.7z...no changes
http://rlwpx.free.fr/WPFF/hrsk.7z...no changes
http://rlwpx.free.fr/WPFF/hsex.7z...no changes
http://rlwpx.free.fr/WPFF/hmis.7z...no changes
DONE. Changes found.
Backing up /etc/hosts.block to /etc/hosts.block.old...done
Extracting and preparing cached files to working directory...
hosts-file.net.download.hosts.zip...extracting...extracted...prepared
hosts-file.net.ad_servers.asp...prepared
hosts-file.net.hphosts-partial.asp...prepared
hostsfile.mine.nu.Hosts.zip...extracting...extracted...prepared
hostsfile.org.Downloads.BadHosts.unx.zip...extracting...extracted...prepared
pgl.yoyo.org.as.serverlist.php.hostformat.hosts.mimetype.plaintext...prepared
someonewhocares.org.hosts.hosts...prepared
sysctl.org.cameleon.hosts...prepared
winhelp2002.mvps.org.hosts.zip...extracting...extracted...prepared
www.ismeh.com.HOSTS...prepared
www.malwaredomainlist.com.hostslist.hosts.txt...prepared
abp.mozilla-hispano.org.nauscopio.hosts.zip...extracting...extracted...prepared
rlwpx.free.fr.WPFF.htrc.7z...extracting...extracted...prepared
rlwpx.free.fr.WPFF.hpub.7z...extracting...extracted...prepared
rlwpx.free.fr.WPFF.hrsk.7z...extracting...extracted...prepared
rlwpx.free.fr.WPFF.hsex.7z...extracting...extracted...prepared
rlwpx.free.fr.WPFF.hmis.7z...extracting...extracted...prepared
Local blacklist...prepared
Local whitelist...prepared
DONE.
Processing files...done
1829874 urls blocked
Running postprocessing...done
Hostsblock completed at 08/19/2012 05:45:41
hosts blocked ----> 1,829,874
hosts.block file ---> 51.8 MiB
dnsmasq (RAM) ---> 162 MiB
Processing time --> 62 seconds
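Figures like the block count and file size can be reproduced with standard tools. A small sketch (the sample file and the 0.0.0.0 redirect target are assumptions; as noted earlier in the thread, the redirect target can also be 127.0.0.1):

```shell
# Illustrative only: count block entries and report the size of a
# hosts.block-style file. A tiny sample stands in for /etc/hosts.block.
printf '0.0.0.0 ads.example.com\n0.0.0.0 tracker.example.net\n# comment\n' > hosts.block

# Count real block entries, skipping comments and blank lines.
grep -c '^0\.0\.0\.0[[:space:]]' hosts.block

# Human-readable file size.
du -h hosts.block | cut -f1
```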
Offline
Is it really useful to have such a huge block list?
I have a very nuisance-free internet connection with about 4MB blocklist.
I wonder if I'm missing something.
Offline
@sadi: I'm assuming ontobelli, like me, has a larger list that doesn't just block nuisances, but also domains that perpetuate malware and trackers. It's just an added security feature. On the other hand, when you have this many block entries, there is a bit of a trade-off: the number of false positives goes up (but they are thankfully a lot easier to deal with, thanks to hostsblock-urlcheck).
Offline
Is it really useful to have such a huge block list?
I have a very nuisance-free internet connection with about 4MB blocklist.
I wonder if I'm missing something.
The optimal situation is just to have a small hosts file that blocks only the ads, trackers, and malware of the sites you visit. But in that case you have to maintain those lists manually, which is a time-consuming task. Because I'm too lazy to do that, I prefer to automate jobs as much as possible.
In my case I don't feel any performance penalty. dnsmasq is resolving cached IPs in 2-3 ms with 160 MiB in RAM. With a 4 GB laptop I don't have any problem with that. But on older machines I have a small hosts file and I use neither dnsmasq nor kwakd.
You can add an extra layer of protection using OpenDNS servers, especially suited to less powerful machines as a replacement for hostsblock.
Just think about what your best cost/benefit ratio is and adapt to that.
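For reference, pointing dnsmasq at a hostsblock-style file takes only a couple of lines in dnsmasq.conf (addn-hosts and cache-size are real dnsmasq options; the path and cache size shown are assumptions, not taken from anyone's actual setup):

```
# /etc/dnsmasq.conf (fragment)
addn-hosts=/etc/hosts.block   # read blocklist entries in addition to /etc/hosts
cache-size=10000              # enlarge the DNS cache for faster repeat lookups
```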
Last edited by ontobelli (2012-08-19 15:02:24)
Offline
@gaenserich:
Thanks for the new version. It worked without any problems here.
@gaenserich&ontobelli:
Thanks a lot for your replies. As I use a netbook with 2 GB RAM, it seems a 4 MB blocklist might be the optimal solution for me.
By the way, "Spybot Search & Destroy" (www.safer-networking.org) is also good on these matters.
I "steal" their list from the Windows hosts file from time to time, in addition to the main lists cited here.
Is there a better way of including the domains on their list?
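One way to automate the "stealing" is to normalize the Windows hosts file into a plain one-domain-per-line list that can be merged into a local blacklist. A sketch under stated assumptions (file names and sample entries are illustrative; Windows hosts files use CRLF line endings and a 127.0.0.1 redirect target):

```shell
# Illustrative only: convert a Windows-format hosts file into a plain
# domain list. Sample data stands in for the real Spybot list.
printf '127.0.0.1 ads.example.com\r\n127.0.0.1 spy.example.net\r\n# comment\r\n' > spybot_hosts.txt

# Strip carriage returns, drop comments, keep the domain column,
# and skip the localhost entry itself.
tr -d '\r' < spybot_hosts.txt \
    | grep -v '^#' \
    | awk '$2 != "" && $2 != "localhost" {print $2}' \
    | sort -u > blacklist.txt

cat blacklist.txt
```

The resulting blacklist.txt could then be appended to whatever local blacklist file your setup already uses.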
Offline
After some testing I arrived at this reasonable list, at 25% of the size of the previous one.
'http://hosts-file.net/download/hosts.zip'
'http://hosts-file.net/ad_servers.asp'
'http://hosts-file.net/hphosts-partial.asp'
'http://hostsfile.mine.nu/Hosts.zip'
'http://hostsfile.org/Downloads/BadHosts.unx.zip'
'http://someonewhocares.org/hosts/hosts'
'http://sysctl.org/cameleon/hosts'
'http://winhelp2002.mvps.org/hosts.zip'
'http://www.ismeh.com/HOSTS'
'http://www.malwaredomainlist.com/hostslist/hosts.txt'
'http://abp.mozilla-hispano.org/nauscopio/hosts.zip'
'http://pgl.yoyo.org/as/serverlist.php?hostformat=hosts&mimetype=plaintext'
'http://rlwpx.free.fr/WPFF/htrc.7z'
'http://rlwpx.free.fr/WPFF/hpub.7z'
'http://rlwpx.free.fr/WPFF/hrsk.7z'
hosts blocked ----> 463,016
hosts.block file ---> 13.6 MiB
dnsmasq (RAM) ---> 41.5 MiB
Processing time --> 30 seconds
Last edited by ontobelli (2012-08-20 12:58:32)
Offline
Taking out all the blocklists from rlwpx.free.fr should save you a lot of space, if you are interested. These lists are extremely aggressive. I personally like an "all of the above" method: nearly all blocklists, adblocking software in my browser + Ghostery, and OpenDNS. None of these options seems to slow down my browsing noticeably... in fact, with dnsmasq and kwakd, elements that would normally have to be resolved and then blocked by browser adblock extensions get cut off from the outset, which saves a bit of bandwidth too.
winhelp2002.mvps.org.hosts.zip
pgl.yoyo.org.as.serverlist.php.hostformat.hosts.mimetype.plaintext
support.it-mate.co.uk.downloads.hphosts.zip
hosts-file.net.download.yahoo_servers.zip
hosts-file.net.hphosts-partial.asp...prepared
hostsfile.org.Downloads.BadHosts.unx.zip
www.securemecca.com.Downloads.hosts.txt
hostsfile.mine.nu.Hosts.zip...extracting...extracted
www.malwaredomainlist.com.hostslist.hosts.txt
someonewhocares.org.hosts.hosts
sysctl.org.cameleon.hosts
www.ismeh.com.HOSTS
abp.mozilla-hispano.org.nauscopio.hosts.zip
rlwpx.free.fr.WPFF.htrc.7z
rlwpx.free.fr.WPFF.hpub.7z
rlwpx.free.fr.WPFF.hrsk.7z
rlwpx.free.fr.WPFF.hmis.7z
949,951 urls blocked
28M /etc/hosts.block
dnsmasq RAM: ~86M
Total processing time (including download checks): 1:34
Offline
rlwpx.free.fr.WPFF.hmis.7z
That one is very big and blocks a lot of useful sites.
Offline
The latest list of domains regarded by "Spybot - Search & Destroy" as undesirable is here:
http://minus.com/ltWIiRBQ0C9X8
Perhaps it might also be posted on a server like the others.
Safer Networking Limited claims copyright, but I don't know whether such use would be a breach.
Offline
Ah, thanks.
For kwakd, the submitter said:
Note, ideally this should run as its own user/group (to restrict its privileges rather than running directly as root)
How do you do that?
kwakd maintainer here: that should be fixed now. The updated package grants
"CAP_NET_BIND_SERVICE" to the kwakd binary, which is run by an unprivileged
user account.
gaenserich, could you host your script in a Github repository? That way, it's
easier for people to critique and contribute.
Offline
@ontobelli: Indeed, I'm sure most of the false positives originate from rlwpx.free.fr.WPFF.hmis.7z.
@sadi: If you can find a public URL with a list, and this URL updates itself regularly and automatically, I'll include it. Ask the Spybot maintainers about it.
@tlvince: Thanks so much for the update. Also, following your advice, I have bowed to pressure (see next post).
Offline
Now available via github: https://github.com/gaenserich/hostsblock
Offline
A new homepage: http://gaenserich.github.com/hostsblock/
Offline