https://bbs.archlinux.org/viewtopic.php?id=80204
@el mariachi you need to edit hosts.local and add your host name, the same one that is in /etc/rc.conf, to solve the delay.
@vwyodajl yes, it's inspired by graysky's hosts_update, but mine is better:
- More lists
- Only downloads files if they were updated
- Blocks with 0.0.0.0 instead of 127.0.0.1
- Different commands to process the text file
@ewaller I think that more important than the age of a post is its RELEVANCE in PRESENT time.
The hosts file is relevant every day in Linux, Windows and maybe other OSes.
That attitude only discourages participation.
Last edited by ontobelli (2012-02-17 08:15:13)
Offline
@ewaller I think that more important than the age of a post is its RELEVANCE in PRESENT time.
Right you are. This is the basis of our policy to not continue too old threads but to open a fresh, more visible one and link back to the old thread if relevant. Just to quote from our Forum Etiquette.
If you have something to add and judge that your information is related, but more up-to-date, start a new thread and link to the old if desired, but avoid duplicating effort by posting information already contained in the Arch wiki.
Thus forcing a new thread as done in this case is exactly what was needed.
BTW: The bracketed part of the thread title is not very useful. Remember, titles are there to find relevant information, not to rant. Please reword this asap.
To know or not to know ...
... the questions remain forever.
Offline
I haven't noticed any difference, but some of those entries are legitimate sites. hosts.hphosts blocks pastebin?
Offline
Thus forcing a new thread as done in this case is exactly what was needed.
I think that way is inefficient: it produces many duplicated threads and is more complicated for the reader. But it's your forum.
ontobelli, damn, the size of the hosts file is getting big: 269823 entries for me. How does that affect browsing speed in general?
I never notice a delay. Reading from it takes a few milliseconds, and most probably it's already in RAM the second time you need it.
You can choose which lists to download. In my case it works great to block ads and protect me from dangerous sites.
I haven't noticed any difference, but some of those entries are legitimate sites. hosts.hphosts blocks pastebin?
Yes, I have the same inconvenience with some sites. That's why I added the line.
# remove unneeded blocked sites
grep -Ev ' dl.dropbox.com| host_you_want_to_whitelist' "${TMPFILE}7" > "${TMPFILE}9"
Just modify it to your needs.
# remove unneeded blocked sites
grep -Ev ' dl.dropbox.com| pastebin.com| www.pastebin.com' "${TMPFILE}7" > "${TMPFILE}9"
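One caveat with the pattern above: the dots match any character and nothing anchors the end of the name, so ` pastebin.com` also spares something like `pastebin.com.example.net`. A minimal sketch of an exact-match alternative, assuming the input lines have the form `IP hostname`; the function name `drop_whitelisted` is made up for this example and is not part of the script.

```sh
# drop_whitelisted HOST...: read "IP hostname" lines on stdin and print
# only those whose hostname is not exactly one of the given HOSTs.
drop_whitelisted() {
    awk -v list="$*" '
        BEGIN {
            # build a lookup table from the space-separated host names
            n = split(list, names, " ")
            for (i = 1; i <= n; i++) allowed[tolower(names[i])] = 1
        }
        # print the line unless its second field is in the table
        !(tolower($2) in allowed)
    '
}
```

It would be used in place of the grep line, for example: `drop_whitelisted dl.dropbox.com pastebin.com www.pastebin.com < "${TMPFILE}7" > "${TMPFILE}9"`.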
Cheers.
Last edited by ontobelli (2012-02-17 08:42:06)
Offline
My awk fu says this is clearer
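# -h stops grep from prefixing each line with its file name, which would break the /^127.0.0.1/ match below
grep -h -v -i -f whitelist hosts.mvps hosts.hphosts hosts.partial hosts.adservers hosts.yoyo hosts.sysctl | \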
awk '
/^127.0.0.1/ {
print "0.0.0.0", tolower($2)
}
' | sort | uniq
Do people not like awk?
How do you live?
Edit: also needs a cleanup if a comment immediately follows the url without a space separator, but I leave that as an exercise to the reader.
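The cleanup left as an exercise could look something like the following. This is only a sketch, not fschiff's code: the function name `normalize_hosts` is made up, skipping `localhost` is an added assumption (some lists ship a `127.0.0.1 localhost` line), and the whitelist step is left out.

```sh
# normalize_hosts: read raw hosts-list text on stdin and print unique,
# sorted "0.0.0.0 hostname" lines with lowercased host names.
normalize_hosts() {
    tr -d '\r' |
    awk '
        {
            # drop comments, even when glued to the host name;
            # changing $0 makes awk split the fields again
            sub(/#.*/, "")
        }
        $1 == "127.0.0.1" && NF >= 2 && $2 != "localhost" {
            print "0.0.0.0", tolower($2)
        }
    ' | sort -u
}
```

Usage would be along the lines of `cat hosts.mvps hosts.yoyo | normalize_hosts`.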
Last edited by fschiff (2012-02-17 15:41:32)
Offline
https://bbs.archlinux.org/viewtopic.php?id=80204
@el mariachi you need to edit hosts.local and add your host name, the same one that is in /etc/rc.conf, to solve the delay.
This is what I've done and it doesn't work
Offline
Do people not like awk?
I have no idea. I'm a complete noob with Linux and Unix tools.
But any improvement is welcome. Awk syntax looks simple and clear. I hope also faster than the original.
This is what I've done and it doesn't work
@el mariachi If you're sure that the /etc/hosts has in the first line "127.0.0.1 YOUR_HOST_NAME" then I don't know what the problem is. Maybe the problem is your DNS server.
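A quick way to double-check that entry, as a rough sketch. The function name `has_loopback_name` is made up here, and it only tests whether the name is mapped to a loopback address anywhere in the file; it says nothing about whether that is really the cause of the delay.

```sh
# has_loopback_name FILE NAME: succeed if FILE maps NAME to 127.0.0.1 or ::1.
has_loopback_name() {
    awk -v name="$2" '
        { sub(/#.*/, "") }                      # ignore comments
        $1 == "127.0.0.1" || $1 == "::1" {
            for (i = 2; i <= NF; i++) if ($i == name) found = 1
        }
        END { exit !found }                     # exit status 0 only when found
    ' "$1"
}
```

For example: `has_loopback_name /etc/hosts "$(uname -n)" || echo "host name missing from /etc/hosts"`.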
Offline
Some improvements using pipes instead of tmp files.
Awk is not my friend yet, but I'm learning it so I can use it in a future version.
#!/bin/bash
#
# Credit to Ontobelli for this script
# Version: 2012-02-18 07:35 UTC
#
# make hosts temporary directory
HOSTSDIR=~/.hostsupdate
mkdir -p "${HOSTSDIR}"
# set output file
OUTPUTFILE="/etc/hosts"
if [ ! -f "${HOSTSDIR}/hosts.local" ]; then
    echo "You need to create ${HOSTSDIR}/hosts.local containing the hosts you wish to keep!" >&2
    # exit non-zero so callers can tell the update did not run
    exit 1
fi
# download the mvps.org hosts file.
wget -c -O "${HOSTSDIR}/hosts.mvps" "http://winhelp2002.mvps.org/hosts.txt"
# download hpHOSTS
wget -c -O "${HOSTSDIR}/hosts.hphosts" "http://support.it-mate.co.uk/downloads/HOSTS.txt"
# download hpHOSTS Partial
wget -c -O "${HOSTSDIR}/hosts.partial" "http://hosts-file.net/hphosts-partial.asp"
# download hpHOSTS ad/tracking servers
wget -c -O "${HOSTSDIR}/hosts.adservers" "http://hosts-file.net/ad_servers.asp"
# download the pgl.yoyo.org hosts Peter Lowe - AdServers
wget -c -O "${HOSTSDIR}/hosts.yoyo" "http://pgl.yoyo.org/as/serverlist.php?hostformat=hosts&showintro=0&mimetype=plaintext"
# download SysCtl Cameleon hosts
wget -c -O "${HOSTSDIR}/hosts.sysctl" "http://sysctl.org/cameleon/hosts"
# hosts header
cat "${HOSTSDIR}"/hosts.local > "${OUTPUTFILE}"
# hosts body
cat "${HOSTSDIR}/hosts.mvps" "${HOSTSDIR}/hosts.hphosts" \
"${HOSTSDIR}/hosts.partial" "${HOSTSDIR}/hosts.adservers" \
"${HOSTSDIR}/hosts.yoyo" "${HOSTSDIR}/hosts.sysctl" | \
sed -e 's/ / /g' | grep ^127.0.0.1 | tr -s '[:space:]' | tr -d "\r" | \
sed -e 's/127.0.0.1 /0.0.0.0 /g' | cut -d ' ' -f -2 | sort | uniq | \
grep -Ev ' dl.dropbox.com| dropbox.com| pastebin.com| www.pastebin.com' >> "${OUTPUTFILE}"
# hosts footer
echo -e "# end of file" >> "${OUTPUTFILE}"
Remember, hosts.local is a must. For example:
# /etc/hosts: static lookup table for host names
#
#<ip> <hostname.domain.org> <hostname>
127.0.0.1 localhost.localdomain localhost YOUR_HOSTS_NAME_HERE
::1 localhost.localdomain localhost YOUR_HOSTS_NAME_HERE
# YOUR PERSONAL list
# blocked list
Create an alias in your ~/.bashrc
alias hu='sudo /root/.hostsupdate/hosts_update'
Execute:
# hu <enter>
Cheers.
Offline
fschiff wrote:Do people not like awk?
I have no idea. I'm a complete noob with Linux and Unix tools.
But any improvement is welcome. Awk syntax looks simple and clear. I hope also faster than the original.
el mariachi wrote:This is what I've done and it doesn't work
@el mariachi If you're sure that the /etc/hosts has in the first line "127.0.0.1 YOUR_HOST_NAME" then I don't know what the problem is. Maybe the problem is your DNS server.
Yes, I'm positive. My DNS, how? I don't have a custom one set.
thanks for the help though
Offline
FWIW, I use urxvt and haven't noticed anything. Try using urxvtd + urxvtc?
Offline
that's what I'm using. So maybe I changed something?
Offline
Possibly. I haven't been able to find anything about urxvt reading /etc/hosts. It isn't in the man page anyway.
Offline
X also takes a loooong time to start
Offline
Some bugs fixed
#!/bin/bash
#
# Inspired by graysky's script and HostMan
# Credit to Ontobelli for this script
# Version: 2012-02-26 15:35 UTC
#
# make hosts temporary directory
HOSTSDIR=~/.hostsupdate
mkdir -p "${HOSTSDIR}"
# set output file
OUTPUTFILE=/etc/hosts
if [ ! -f "${HOSTSDIR}/hosts.local" ]; then
    echo "You need to create ${HOSTSDIR}/hosts.local containing the hosts you wish to keep!" >&2
    # exit non-zero so callers can tell the update did not run
    exit 1
fi
# download hpHOSTS
wget -c -O "${HOSTSDIR}/hosts.hphosts" "http://hosts-file.net/download/hosts.txt"
# download hpHOSTS Partial
wget -c -O "${HOSTSDIR}/hosts.partial" "http://hosts-file.net/hphosts-partial.asp"
# download hpHOSTS ad/tracking servers
wget -c -O "${HOSTSDIR}/hosts.adservers" "http://hosts-file.net/ad_servers.asp"
# download the mvps.org hosts file.
wget -c -O "${HOSTSDIR}/hosts.mvps" "http://winhelp2002.mvps.org/hosts.txt"
# download the pgl.yoyo.org hosts Peter Lowe - AdServers
wget -c -O "${HOSTSDIR}/hosts.yoyo" "http://pgl.yoyo.org/as/serverlist.php?hostformat=hosts&showintro=1&mimetype=plaintext"
# download SysCtl Cameleon hosts
wget -c -O "${HOSTSDIR}/hosts.sysctl" "http://sysctl.org/cameleon/hosts"
# hosts header
cat "${HOSTSDIR}"/hosts.local > "${OUTPUTFILE}"
# hosts body
cat "${HOSTSDIR}/hosts.hphosts" "${HOSTSDIR}/hosts.adservers" "${HOSTSDIR}/hosts.partial" \
"${HOSTSDIR}/hosts.mvps" "${HOSTSDIR}/hosts.yoyo" "${HOSTSDIR}/hosts.sysctl" |
sed -e 's/ / /g' | tr -s '[:space:]' | tr -d "\r" | cut -d ' ' -f -2 | sort | uniq | \
grep -Ev '#|::1| dl.dropbox.com| dropbox.com| www.dropbox.com| pastebin.com| www.pastebin.com' | \
sed -e 's/127.0.0.1 /0.0.0.0 /g' >> "${OUTPUTFILE}"
# hosts footer
echo -e "# end of file" >> "${OUTPUTFILE}"
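One thing to keep in mind: the script above writes to /etc/hosts in place, so an interrupted run can leave a half-written file. A possible way around that, as a sketch only: collect the output in a temporary file next to the target and rename it at the end. The function name `write_atomically` is made up, and the sketch assumes a `mktemp` that accepts a template argument.

```sh
# write_atomically TARGET: copy stdin to a temporary file next to TARGET
# and rename it over TARGET only if the copy succeeded and is not empty.
write_atomically() {
    target=$1
    tmp=$(mktemp "${target}.XXXXXX") || return 1
    if cat > "$tmp" && [ -s "$tmp" ]; then
        # mktemp creates the file with mode 0600; a hosts file must be world-readable
        chmod 644 "$tmp" && mv "$tmp" "$target"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

The header, body and footer would then be grouped and piped in, for example `{ cat "${HOSTSDIR}/hosts.local"; echo "# end of file"; } | write_atomically /etc/hosts`. Note that this only guards against empty or partial output; it does not check whether the downloads themselves succeeded.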
Last edited by ontobelli (2012-02-26 15:50:27)
Offline
Some bugs fixed
#!/bin/bash
#
# Inspired by graysky's script and HostMan
# Credit to Ontobelli for this script
# Version: 2012-04-02 01:00 UTC
#
# make hosts temporary directory
HOSTSDIR=~/.hostsupdate
mkdir -p "${HOSTSDIR}"
# set output file
OUTPUTFILE=/etc/hosts
if [ ! -f "${HOSTSDIR}/hosts.local" ]; then
    echo "You need to create ${HOSTSDIR}/hosts.local containing the hosts you wish to keep!" >&2
    # exit non-zero so callers can tell the update did not run
    exit 1
fi
# download the mvps.org hosts file.
wget -c -O "${HOSTSDIR}/hosts.mvps" "http://winhelp2002.mvps.org/hosts.txt"
# download SysCtl Cameleon hosts
wget -c -O "${HOSTSDIR}/hosts.sysctl" "http://sysctl.org/cameleon/hosts"
# download hpHOSTS
wget -c -O "${HOSTSDIR}/hosts.hphosts" "http://hosts-file.net/download/hosts.txt"
# download hpHOSTS ad/tracking servers
wget -c -O "${HOSTSDIR}/hosts.adservers" "http://hosts-file.net/ad_servers.asp"
# download hpHOSTS Partial
wget -c -O "${HOSTSDIR}/hosts.partial" "http://hosts-file.net/hphosts-partial.asp"
#echo -e " " >> "${HOSTSDIR}/hosts.partial"
# download the pgl.yoyo.org hosts Peter Lowe - AdServers
wget -c -O "${HOSTSDIR}/hosts.yoyo" "http://pgl.yoyo.org/as/serverlist.php?hostformat=hosts&showintro=0&mimetype=plaintext"
# hosts header
echo -e "127.0.0.1 localhost ${HOSTNAME}" > "${OUTPUTFILE}"
echo -e "::1 localhost ${HOSTNAME}" >> "${OUTPUTFILE}"
# hosts body
cat "${HOSTSDIR}/hosts.mvps" \
"${HOSTSDIR}/hosts.sysctl" \
"${HOSTSDIR}/hosts.hphosts" \
"${HOSTSDIR}/hosts.adservers" \
"${HOSTSDIR}/hosts.partial" \
"${HOSTSDIR}/hosts.yoyo" \
"${HOSTSDIR}/hosts.local" | \
sed -e 's/ / /g' | \
grep '127.0.0.1 ' | \
tr -d "\r" | \
tr -s '[:space:]' | \
cut -d ' ' -f -2 | \
sort | uniq | \
grep -vf "${HOSTSDIR}/hosts.whitelist" | \
sed -e 's/127.0.0.1 /0.0.0.0 /g' >> "${OUTPUTFILE}"
rm "${HOSTSDIR}/hosts.mvps" \
"${HOSTSDIR}/hosts.sysctl" \
"${HOSTSDIR}/hosts.hphosts" \
"${HOSTSDIR}/hosts.adservers" \
"${HOSTSDIR}/hosts.partial" \
"${HOSTSDIR}/hosts.yoyo"
You need to create a hosts.whitelist file in ${HOSTSDIR}.
Here is an example. Add the hosts you want to whitelist.
#
::1
127.0.0.1 127.0.0.1
127.0.0.1 localhost
127.0.0.1 dl.dropbox.com
127.0.0.1 dropbox.com
127.0.0.1 www.dropbox.com
127.0.0.1 pastebin.com
127.0.0.1 www.pastebin.com
127.0.0.1 tinyurl.com
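A word of caution about `grep -vf`: every line of the whitelist is treated as a regular expression, and an empty line matches everything, so a stray blank line in hosts.whitelist would empty the whole block list. A sketch of a stricter variant, assuming it sits at the same point in the pipeline, where each line is exactly `127.0.0.1 hostname`; the function name `apply_whitelist` is made up.

```sh
# apply_whitelist FILE: read "127.0.0.1 hostname" lines on stdin and drop
# every line that appears verbatim in FILE.
apply_whitelist() {
    # -F: fixed strings, -x: whole-line match, -f: read patterns from FILE.
    # grep exits with 1 when nothing is left over; treat that as success too.
    grep -vFxf "$1" || [ $? -eq 1 ]
}
```

With whole-line matching, the `#` and `::1` entries of the example whitelist no longer act as substring filters. Commented-out entries such as `#127.0.0.1 example.com` would then pass through, but they stay comments in the final file.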
Last edited by ontobelli (2012-04-02 15:48:06)
Offline
Thanks to ontobelli, I have made an "installation folder" using this script, with some minor modifications, as well as a couple of others from elsewhere.
I have naturally first tested this on myself ;-)
I didn't find any harmful effects, but saw some benefits.
As an inexperienced Linux user (for about a year, after several years of experimenting) I'm very much open to suggestions.
Indeed, I would also be very happy if someone more experienced and skilled in this area could take this amateur work, turn it into something more like a "Hosts Manager Utility" and maintain it properly instead of me.
If you are interested, first you need to download a compressed file from here:
http://minus.com/mblIoq0NI6/1f
And then extract the "hosts-manager" folder in it, and start by reading the file "hosts-manager-readme.txt" first.
Kind regards,
Sadi
Offline
@Sadi
Very nice work. And a big step forward.
Testing it.
Offline
I've made an attempt to provide a GUI for it by means of Zenity as well as a couple of other improvements.
So here is version 0.91:
http://minus.com/mz7COXafc/1f
Offline
I didn't notice this thread until after I posted my own cronscript here: https://bbs.archlinux.org/viewtopic.php?id=139784
It has a couple of nice advantages, including being a little bit more flexible with your lists (just use "addurl (url)" instead of using wget -O etc for each entry), it preserves entries from your original target file, and it's non-interactive (i.e. can be used as a cronscript)
@el mariachi: One way to make up for any performance penalty from a hosts file is to use dnsmasq as a cache server (https://wiki.archlinux.org/index.php/Dn … ache_Setup)
Check out hostsblock for system-wide ad- and malware-blocking.
Offline
@gaenserich: Thanks a lot for sharing.
Obviously there are a couple of things to study and learn from your script, especially your addurl and addzip subroutines.
But I wonder why you preferred curl to wget though.
For your information, I've also found a nice script at Puppy Linux forum, which uses gtkdialog3 and provides a nice GUI.
I'll paste it here, since it is only distributed inside a Puppy package file (i.e. a .pet).
#!/bin/sh
#v0.4 created by sc0ttman, August 2010
#GPL license /usr/share/doc/legal/gpl-2.0.txt
#100830 BK added GPL license, amended Exit msg, bug fixes.
# advert blocker
# downloads a list of known advert servers
# then appends them to /etc/hosts so that
# many online adverts are blocked from sight
# make a hosts file if none found, or add a marker
if [ ! -f /etc/hosts ];then
echo "#host file
127.0.0.1 localhost puppypc
" > /etc/hosts
fi
# set vars
export appver='0.5'
export title='Pup Advert Blocker'
# the markers used to find the changes in /etc/hosts, which are made by this app
export markerstart='# pup-advert-blocker IPs below'
export markerend='# pup-advert-blocker IPs above'
export mvps='false'
export systcl='false'
export technobeta='false'
export yoyo='false'
# create functions
set -a
# cleanup all leftover files
cleanup () {
# remove all temp files
rm -f /tmp/adlist{1,2,3,4} /tmp/adlist-all /tmp/hosts-temp
}
# download the ads lists
download_adlist () {
# mvps
if [ "$mvps" = true ]; then
wget -c -4 -t 0 -T 10 -O /tmp/adlist1 'http://www.mvps.org/winhelp2002/hosts.txt'
fi
# systcl
if [ "$systcl" = true ]; then
wget -c -4 -t 0 -T 10 -O /tmp/adlist2 'http://sysctl.org/cameleon/hosts'
fi
# technobeta
if [ "$technobeta" = true ]; then
wget -c -4 -t 0 -T 10 -O /tmp/adlist3 'http://www.technobeta.com/download/urlfilter.ini'
fi
# yoyo
if [ "$yoyo" = true ]; then
wget -c -4 -t 0 -T 10 -O /tmp/adlist4 'http://pgl.yoyo.org/as/serverlist.php?hostformat=hosts&showintro=0&mimetype=plaintext'
fi
#100830 BK bug fix: create if not exist...
touch /tmp/adlist{1,2,3,4}
# combine the downloaded lists, then sort and remove duplicates
cat /tmp/adlist{1,2,3,4} |grep ^[1-9] |sed "s/\t//g" |sort |uniq > /tmp/adlist-all
}
# clean out everything but the list of IPs and servers
clean_adlist () {
sed -i '/^#/d' /tmp/adlist-all # remove all comments
sed -i '/localhost/d' /tmp/adlist-all # remove the original links to localhost (we already have them)
sed -i '/^$/d' /tmp/adlist-all # remove empty lines
sed -i 's/\t/ /' /tmp/adlist-all # replace all tabs with spaces
sed -i 's/ / /g' /tmp/adlist-all # remove double spaces
dos2unix -u /tmp/adlist-all # change all carriage returns to UNIX format
# remove duplicates (again)
adlistall="`cat /tmp/adlist-all |sort | uniq`" #100830 BK
echo "$adlistall" > /tmp/adlist-all #100830 BK
}
# append the list to the /etc/hosts
append_adlist () {
# echo all but the stuff between the markers to a temp hosts file
sed -e "/$markerstart/,/$markerend/d" /etc/hosts > /tmp/hosts-temp
# remove the markers
sed -i -e "/$markerstart/d" /tmp/hosts-temp
sed -i -e "/$markerend/d" /tmp/hosts-temp
# check the size of the final adlist
# get contents of the downloaded adlist
adlist=`cat /tmp/adlist-all`
if [ ! -z "$adlist" ];then
# add list contents into the hosts file, below a marker (for easier removal)
echo "$markerstart" >> /tmp/hosts-temp
echo "$adlist" >> /tmp/hosts-temp
echo "$markerend" >> /tmp/hosts-temp
else
Xdialog --title "$title $appver" --msgbox "No ad lists selected. Ad blocking will be disabled." 0 0
fi
# replace the original with the cleaned version
mv "/tmp/hosts-temp" "/etc/hosts"
}
success () {
# tell user
Xdialog --title "$title $appver" --msgbox "Success - your settings have been changed.\n\nYour hosts file has been updated.\nRestart your browser to see the changes." 0 0 &
}
# create a GUI
export HELP_GUI='<window title="'$title' '$appver'">
<frame>
<vbox>
<text width-request="360">
<label> The "'$title'" tool adds stuff to your "/etc/hosts" file, so that many advertising servers and websites will not be able to connect to this PC.</label>
</text>
<text><label>""</label></text>
<text width-request="360">
<label>Leave your cursor over a service to see a short description. You can choose one service or combine multiple services for more advert protection.</label>
</text>
<text><label>""</label></text>
<text width-request="360">
<label>Blocking ad servers protects your privacy, saves you bandwidth, greatly improves web-browsing speeds and makes the internet much less annoying in general.</label>
</text>
<text><label>""</label></text>
<text width-request="360">
<label>'$title' '$appver', by sc0ttman</label>
</text>
</vbox>
<text><label>""</label></text>
<hbox>
<button tooltip-text="Close this help dialog">
<input file icon="gtk-quit"></input>
<label>Close</label>
<action>exec gtkdialog3 --program GUI --center &</action>
<action type="exit">EXIT_NOW</action>
</button>
</hbox>
</frame>
</window>'
export GUI='<window title="'$title' '$appver'">
<vbox>
<vbox homogeneous="true">
<frame>
<text><label>'$title'</label></text>
<text><label>Block online ads in all browsers with this simple tool</label></text>
</frame>
</vbox>
<vbox>
<frame>
<vbox>
<text>
<label>Choose your preferred ad blocking service(s)</label>
</text>
</vbox>
<hbox>
<checkbox tooltip-text="Blocks many known malware sites and unsafe adult networks">
<label>Mvps.org</label>
<variable>mvps</variable>
<default>false</default>
</checkbox>
<checkbox tooltip-text="A large, fairly comprehensive list of many known ad servers">
<label>Systcl.org</label>
<variable>systcl</variable>
<default>false</default>
</checkbox>
<checkbox tooltip-text="A smaller list of popup adverts, ad servers and ad networks">
<label>Technobeta.com</label>
<variable>technobeta</variable>
<default>false</default>
</checkbox>
<checkbox tooltip-text="A small and effective list of popular ad servers">
<label>Yoyo.org</label>
<variable>yoyo</variable>
<default>false</default>
</checkbox>
</hbox>
</frame>
<frame>
<vbox>
<hbox>
<text>
<label>Click the "Start" button to download and block the latest list of known advertising servers</label>
</text>
<button width-request="70" tooltip-text="Click to download and then block a list of advertising servers">
<variable>START</variable>
<input file icon="gtk-execute"></input>
<label>Start</label>
<action>download_adlist</action>
<action>clean_adlist</action>
<action>append_adlist</action>
<action>cleanup</action>
<action>success</action>
</button>
</hbox>
<text><label>""</label></text>
<hbox>
<text>
<label>Or click the "Edit" button to manually edit your hosts file, using your default text editor</label>
</text>
<button width-request="70" tooltip-text="Manually edit your hosts file in a text editor, adding or removing any entries you like">
<variable>EDIT</variable>
<input file icon="gtk-edit"></input>
<label>Edit</label>
<action>defaulttexteditor /etc/hosts &</action>
</button>
</hbox>
</vbox>
</frame>
<frame>
<hbox>
<button tooltip-text="Learn more about blocking ads">
<variable>HELP</variable>
<input file icon="gtk-help"></input>
<label>Help</label>
<action>exec gtkdialog3 --program HELP_GUI --center &</action>
<action type="exit">EXIT_NOW</action>
</button>
<button tooltip-text="Exit '$title'">
<variable>QUIT</variable>
<input file icon="gtk-quit"></input>
<label>Quit</label>
<action type="exit">EXIT_NOW</action>
</button>
</hbox>
</frame>
</vbox>
</vbox>
</window>'
# cleanup before start
cleanup
# run the program
gtkdialog3 --program GUI --center
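The marker idea used by append_adlist in the script above is handy on its own: everything between two marker lines belongs to the tool and can be swapped out without touching the rest of the hosts file. A stripped-down sketch of just that step; the function name `replace_marked_block` is made up, and it compares the markers as plain strings instead of sed patterns.

```sh
# replace_marked_block FILE START END: print FILE without the lines from
# START through END (inclusive), then append a fresh block built from stdin.
replace_marked_block() {
    awk -v s="$2" -v e="$3" '
        $0 == s { skip = 1; next }   # entering the old block
        $0 == e { skip = 0; next }   # leaving the old block
        !skip                        # keep everything outside it
    ' "$1"
    printf '%s\n' "$2"
    cat
    printf '%s\n' "$3"
}
```

The result goes to standard output, so it would be redirected to a temporary file and moved into place afterwards, much like the script does with /tmp/hosts-temp.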
Offline
I've made a couple more improvements and decided that this is all I can do with my present level of skill, hoping that a more skilled volunteer can take it further from here.
Version 0.92 can be downloaded from here:
http://minus.com/mblIoq0NI6/4f
Last edited by sadi (2012-04-18 14:49:00)
Offline
@sadi Curl has a number of small advantages over wget (see http://daniel.haxx.se/docs/curl-vs-wget.html), plus it's included on OS X, so I can use the same script on my wife's box.
I also didn't include a GUI since I use this script as a cronjob that runs once every day or week in the background.
Offline
Just updated my own script, now available via the aur. See https://bbs.archlinux.org/viewtopic.php?id=139784
Offline