Author Topic: Use /etc/host for ad-blocking?  (Read 4770 times)

Offline emninger

  • Sr. Member
  • ****
  • Posts: 267
Use /etc/host for ad-blocking?
« on: December 23, 2015, 06:21:18 AM »
I found the attached script somewhere (I adapted it a bit because of a slight difference between wget and busybox wget). It gathers blacklists into an /etc/hosts file, which then sends cross-site-scripting and advertising sites to 0.0.0.0, i.e. a kind of black hole.

My question to you more skillful people: does this seem to be a reasonable way to block ads?

Currently I do it with a forbidden file in polipo, which is much more limited, but both methods live together nicely. I prefer not to use ad-blocker add-ons in Firefox because they're big memory "eaters". Moreover, this way other apps that rely on the hosts file (fifth, for example) use it too.

Offline Misalf

  • Hero Member
  • *****
  • Posts: 1702
Re: Use /etc/host for ad-blocking?
« Reply #1 on: December 23, 2015, 07:09:54 AM »
First, you posted this in the wrong section. That's either related to Base or Off Topic. Not extensions.

--

I believe this technique is a reasonable way to block certain sites, provided you don't put too much trust in the downloaded lists, because they cannot blacklist every bad site.
MVPS claims to not block everything since there are ad sites that don't do any harm but help certain sites to provide their features for free to the users.

The script needs GNU wget it seems (at least for  hosts-file.net ).
The script doesn't need bash (i.e. you can change  #!/bin/bash  to  #!/bin/sh ).

The script does redundant work by removing entries NOT starting with 127.0.0.1 while the list from MVPS already uses 0.0.0.0 so every entry from MVPS will be removed.
Change this
Code: [Select]
sed -e 's/\r//' -e '/^127.0.0.1/!d' ...
to this
Code: [Select]
sed -E -e 's/\r//' -e '/^(127\.0\.0\.1|0\.0\.0\.0)/!d' ...
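
A note on the pattern: a bracket expression such as `[127.0.0.1|0.0.0.0]` matches one single character from the set, so any line starting with `0`, `1`, `2`, `7`, `.` or `|` would be kept. Keeping only lines that start with one of the two addresses needs an alternation. Below is a minimal sketch, assuming a sed that accepts `-E` (GNU sed and busybox sed both do); `filter_hosts` is a hypothetical helper name and the sample entries are made up.

```shell
#!/bin/sh
# Keep only lines that start with 127.0.0.1 or 0.0.0.0.
# tr removes DOS carriage returns before sed sees the lines.
filter_hosts() {
    tr -d '\r' |
        sed -E '/^(127\.0\.0\.1|0\.0\.0\.0)[[:space:]]/!d'
}

# Sample input with CRLF line endings (made-up names).
printf '%s\r\n' \
    '# comment line' \
    '127.0.0.1 ads.example.com' \
    '0.0.0.0 tracker.example.net' \
    '10.0.0.1 router.example.org' |
    filter_hosts
```

The `10.0.0.1` line is dropped here, whereas a bracket expression would have let it through because it starts with `1`.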

--

Beware that you would need to run this script after each boot since Core rewrites  /etc/hosts  via  sethostname .
Download a copy and keep it handy: Core book ;)

Offline emninger

  • Sr. Member
  • ****
  • Posts: 267
Re: Use /etc/host for ad-blocking?
« Reply #2 on: December 23, 2015, 08:16:50 AM »
Thanks a lot for your help and suggestions. So, generally, it's a viable strategy :D

First, you posted this in the wrong section. That's either related to Base or Off Topic. Not extensions.

Sorry  :-[

I tried it and it appears to work ... Thanks for the improvement, I'll try to apply it.
...
To save the hosts file I could use what you suggested in another post, correct?
« Last Edit: December 23, 2015, 08:46:35 AM by emninger »

Offline emninger

  • Sr. Member
  • ****
  • Posts: 267
Re: Use /etc/host for ad-blocking?
« Reply #3 on: December 23, 2015, 10:58:29 AM »
Add-on:

For the above script, the command in bootsync.sh would be:

Code: [Select]
cat /home/tc/block-hosts >> /etc/hosts
(Is it right that 'cat' adds the content of block-hosts to /etc/hosts?)

Then perhaps the command should rather be:
Code: [Select]
cp /home/tc/block-hosts /etc/hosts
(since block-hosts already contains the default hosts content)?

Or maybe it would be better to adapt the script so that it does not create a complete hosts file, but only the "blacklist", which then gets appended?

Offline Misalf

  • Hero Member
  • *****
  • Posts: 1702
Re: Use /etc/host for ad-blocking?
« Reply #4 on: December 23, 2015, 11:50:06 AM »
Yes, ">>" appends to a file, leaving its current content intact.
Yes, since the script you attached already merges the original hosts file with the downloaded lists, you should replace the original with the new one.
I don't know if replacing or appending is better for you.
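
The difference between the two commands can be tried out safely on scratch files. This is only an illustrative sketch: the file names and entries are made up, and nothing here touches the real /etc/hosts.

```shell
#!/bin/sh
# Work on scratch copies so the real /etc/hosts is never touched.
dir=$(mktemp -d)
printf '127.0.0.1 localhost\n' > "$dir/hosts"
printf '0.0.0.0 ads.example.com\n' > "$dir/blocklist"

# ">>" appends: the existing localhost line stays, the block entries follow.
cat "$dir/blocklist" >> "$dir/hosts"
cat "$dir/hosts"

# "cp" replaces: only safe if the source already contains the system entries.
printf '127.0.0.1 localhost\n0.0.0.0 ads.example.com\n' > "$dir/hosts-block"
cp "$dir/hosts-block" "$dir/hosts"
cat "$dir/hosts"
```

Both variants end up with the same two lines here, because the replacement file already carries the localhost entry.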

Offline emninger

  • Sr. Member
  • ****
  • Posts: 267
Re: Use /etc/host for ad-blocking?
« Reply #5 on: December 23, 2015, 12:34:53 PM »
I changed the script in this way:

Before:
Code: [Select]
# Combine system hosts with adblocks
echo Merging with original system hosts...
echo -e "\n# Ad blocking hosts generated "$(date) | cat ~/hosts-system - $temphosts2 > ~/hosts-block

# Clean up temp files and remind user to copy new file
echo "Cleaning up..."
rm $temphosts1 $temphosts2
echo "Done."
echo
echo "Copy ad-blocking hosts file with this command:"
echo " sudo cp ~/hosts-block /etc/hosts"
echo
echo "You can always restore your original hosts file with this command:"
echo " sudo cp ~/hosts-system /etc/hosts"
echo "so don't delete that file! (It's saved read-only for your protection.)"
echo

Now:
Code: [Select]
# Combine system hosts with adblocks
#echo Merging with original system hosts...
#echo -e "\n# Ad blocking hosts generated "$(date) | cat ~/hosts-system - $temphosts2 > ~/hosts-block

# In TinyCoreLinux /etc/hosts is recreated at boot by /usr/bin/sethostname,
# so there is no need to merge the blacklist with the original hosts here.
# In bootsync.sh that is done by: 'cat ~/blocklist >> /etc/hosts'
cp "$temphosts2" ~/blocklist

# Clean up temp files
echo "Cleaning up..."
rm $temphosts1 $temphosts2
echo "Done."
echo

Seems to work ... ;)
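
Since a plain `cat ... >>` adds the entries again every time it is run by hand, the append step could be guarded. The following is a minimal sketch, not part of the script discussed here: `append_blocklist`, the marker line and the example paths are all assumptions. An absolute path also avoids depending on what `~` expands to for the user that runs bootsync.sh.

```shell
#!/bin/sh
# append_blocklist HOSTS BLOCKLIST: append BLOCKLIST to HOSTS at most once.
MARKER='# ad-blocking entries'   # hypothetical marker line

append_blocklist() {
    [ -r "$2" ] || return 1                      # no block list to append
    if grep -qxF -e "$MARKER" "$1" 2>/dev/null; then
        return 0                                 # already appended earlier
    fi
    { echo "$MARKER"; cat "$2"; } >> "$1"
}

# In bootsync.sh this might be called as (paths are assumptions):
# append_blocklist /etc/hosts /home/tc/blocklist
```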

Offline emninger

  • Sr. Member
  • ****
  • Posts: 267
Re: Use /etc/host for ad-blocking?
« Reply #6 on: December 24, 2015, 01:40:59 AM »

[...]

The script needs GNU wget it seems (at least for  hosts-file.net ).

[...]

Hi Misalf!

I wanted to come back to this. Indeed, when I run
Code: [Select]
wget -q -O - http://hosts-file.net/ad_servers.asp
I get this error:
Code: [Select]
wget: bad address '.%5Cad_servers.txt'.

Now I'm trying to understand how and why ... ;) And whether there is a way to work around this problem (?)
Guessing in the dark, I opened the address in the browser and copied the link to the hosts file, which looks like this:

Code: [Select]
http://hosts-file.net/.%5Cad_servers.txt
When I run "wget http://hosts-file.net/.%5Cad_servers.txt" it works. But maybe that's a "static" link ... (?)
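
One way to live with this is to try several addresses in turn. `%5C` is a URL-encoded backslash, so the error suggests the server points to a relative location that busybox wget does not resolve, but that reading is an assumption. The sketch below uses a hypothetical helper `fetch_first`; whether the direct link stays valid is not known.

```shell
#!/bin/sh
# FETCH is the download command. "wget -q -O -" (quiet, write to stdout) is
# accepted by both busybox wget and GNU wget.
FETCH=${FETCH:-wget -q -O -}

# fetch_first URL...: print the content of the first URL that downloads.
fetch_first() {
    tmp=$(mktemp) || return 1
    for url in "$@"; do
        # $FETCH is unquoted on purpose so its options split into words.
        if $FETCH "$url" > "$tmp" 2>/dev/null; then
            cat "$tmp"
            rm -f "$tmp"
            return 0
        fi
    done
    rm -f "$tmp"
    echo "all downloads failed" >&2
    return 1
}

# Usage (not run here, since it needs network access):
# fetch_first 'http://hosts-file.net/ad_servers.asp' \
#             'http://hosts-file.net/.%5Cad_servers.txt' > list.txt
```

Buffering each attempt in a temporary file keeps a half-finished download from ending up in the output.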

PS. As I understand it, TCL uses busybox, so when I run 'wget' in a shell it is really 'busybox wget' (because TCL somewhere - where? - defined wget as a symlink to busybox wget). Is that correct?

« Last Edit: December 24, 2015, 01:51:11 AM by emninger »

Online Rich

  • Administrator
  • Hero Member
  • *****
  • Posts: 11619
Re: Use /etc/host for ad-blocking?
« Reply #7 on: December 24, 2015, 01:50:23 AM »
Hi emninger
Install  wget.tcz.

Offline coreplayer2

  • Hero Member
  • *****
  • Posts: 3020
Re: Use /etc/host for ad-blocking?
« Reply #8 on: December 24, 2015, 02:11:50 AM »
Somewhere in this forum is a thread on updating the hosts file, from the good folks at putorius.net.
Maybe it's the same script? This one works fine. Update, then add etc/hosts to your backup.
I've been experimenting with this, but it adds a considerably large file to my backup, so I tend not to use it.
It's hard to tell the difference, as I hardly ever get ads anyhow.

Code: [Select]
~ $ ./HostFileUpdate.sh
Connecting to winhelp2002.mvps.org (216.155.126.40:80)
hosts.txt            100% |************************************************|   498k  0:00:00 ETA
~ $

Code: [Select]
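#!/bin/sh
## Update hosts file
## http://www.putorius.net/2012/01/block-unwanted-advertisements-on.html

## backup old hosts file
# cp /etc/hosts ~/.hosts_bak

if [ -f ~/.hosts_bak ]; then
    cd /tmp || exit 1
    # stop if the download fails, so /etc/hosts is left untouched
    wget http://winhelp2002.mvps.org/hosts.txt || exit 1
    sudo rm /etc/hosts
    sudo mv hosts.txt /etc/hosts
    # with "sudo cat ... >>" the redirection runs as the calling user;
    # tee -a performs the append itself under sudo
    sudo tee -a /etc/hosts < ~/.hosts_bak > /dev/null
else
    echo "No hosts backup found"
fi
exit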


Good luck
« Last Edit: December 24, 2015, 02:13:45 AM by coreplayer2 »

Online Rich

  • Administrator
  • Hero Member
  • *****
  • Posts: 11619
Re: Use /etc/host for ad-blocking?
« Reply #9 on: December 24, 2015, 02:12:42 AM »
Hi emninger
Quote
PS. As I understand it, TCL uses busybox, so when I run 'wget' in a shell it is really 'busybox wget' (because TCL somewhere - where? - defined wget as a symlink to busybox wget). Is that correct?
Yes,  wget  is a symlink in  /bin  which points to busybox. If you install  wget.tcz  then  wget  will also be found in  /usr/local/bin. Since  /usr/local/bin  appears in the  PATH  variable before  /bin  it will default to using that version if you enter  wget.
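
The lookup order can be illustrated without installing anything. In this sketch two scratch directories stand in for /usr/local/bin and /bin, and the two `wget` files are made-up stand-ins, not the real programs.

```shell
#!/bin/sh
# Two scratch directories stand in for /usr/local/bin and /bin.
dir=$(mktemp -d)
mkdir "$dir/local_bin" "$dir/bin"
printf '#!/bin/sh\necho "GNU wget (stand-in)"\n' > "$dir/local_bin/wget"
printf '#!/bin/sh\necho "busybox wget (stand-in)"\n' > "$dir/bin/wget"
chmod +x "$dir/local_bin/wget" "$dir/bin/wget"

# The first directory in PATH that contains the name wins.
# PATH is changed only inside the command substitution's subshell.
resolved=$(PATH="$dir/local_bin:$dir/bin:$PATH"; command -v wget)
echo "$resolved"
```

On a real system, `command -v wget` prints the path that will be used, and `ls -l` on that path shows whether it is a symlink.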

Offline Misalf

  • Hero Member
  • *****
  • Posts: 1702
Re: Use /etc/host for ad-blocking?
« Reply #10 on: December 24, 2015, 06:19:23 AM »
Update and add etc/hosts to your backup
Looking at  tc-config  and  sethostname  it seems one could just remove sethostname from bootsync.sh if there is no need to change the host name (which could still be done via host= bootcode) as long as a valid /etc/hosts file is included in the backup.
Otherwise  sethostname  recreates /etc/hosts with default values after the backup was restored.

Offline Misalf

  • Hero Member
  • *****
  • Posts: 1702
Re: Use /etc/host for ad-blocking?
« Reply #11 on: December 24, 2015, 07:27:21 AM »
If the hostname is set via bootcode, the backed-up hosts file should probably have the same value in the localhost line.
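
A quick way to check this is to look for the host name on the 127.0.0.1 line of the backed-up file. The helper below is a hypothetical sketch (`hosts_has_name` is not an existing tool), and the exact layout of the localhost line is an assumption.

```shell
#!/bin/sh
# hosts_has_name FILE NAME: succeed if a 127.0.0.1 line in FILE lists NAME.
hosts_has_name() {
    awk -v name="$2" '
        $1 == "127.0.0.1" {
            for (i = 2; i <= NF; i++)
                if ($i == name) found = 1
        }
        END { exit !found }
    ' "$1"
}

# Possible use on a live system (not run here):
# hosts_has_name /etc/hosts "$(hostname)" || echo "host name missing from hosts"
```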

Offline emninger

  • Sr. Member
  • ****
  • Posts: 267
Re: Use /etc/host for ad-blocking?
« Reply #12 on: December 24, 2015, 08:28:06 AM »
Thanks! As far as I can see, the script I found (in the Kubuntu forums, by the way) is even broader: it uses not only MVPS but also other sources of host lists and merges them.

With the approach Misalf taught me, i.e. using cat ~/<hostlistfilename> >> /etc/hosts in bootsync, I do not even need a backup (my home is made persistent via bootcode). I do not have to touch the default procedure for setting the host name, it works with busybox wget (no need for GNU wget), and it works flawlessly ... (I'll attach the new script).
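
Merging several lists could look roughly like the sketch below. It is not the script from the Kubuntu forums: `merge_lists` is a hypothetical helper, and pointing every name at 0.0.0.0 while skipping `localhost` is a design assumption.

```shell
#!/bin/sh
# merge_lists FILE...: combine hosts-format lists into one sorted,
# duplicate-free block list that points every name at 0.0.0.0.
merge_lists() {
    cat "$@" |
        tr -d '\r' |
        awk '($1 == "127.0.0.1" || $1 == "0.0.0.0") && $2 != "" && $2 != "localhost" {
                 print "0.0.0.0 " $2
             }' |
        LC_ALL=C sort -u
}

# Example (file names are placeholders):
# merge_lists mvps-hosts.txt other-hosts.txt > ~/blocklist
```

Rewriting every entry to the same address before `sort -u` means a name listed under both 127.0.0.1 and 0.0.0.0 ends up in the result only once.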

Offline Misalf

  • Hero Member
  • *****
  • Posts: 1702
Re: Use /etc/host for ad-blocking?
« Reply #13 on: March 11, 2016, 10:09:30 PM »
I've changed a lot of things to make the script more reliable, especially with regard to my slow internet connection. Having wget.tcz loaded works better than busybox wget, as GNU wget can continue a download even if output goes to STDOUT. Both will do fine, though, if the connection is faster than 56 kbit/s. ;)