Feature #75

open

Feature #145: Improve Epiphany user's experience

Feature #72: Hide ads on web pages

Host a search engine cleaning page

Added by Jean-Michel Philippe about 13 years ago. Updated almost 12 years ago.

Status: Ready for test
Priority: Normal
Category: System
Target version:
Start date: 12/12/2011
Due date: 09/30/2012 (over 11 years late)
% Done: 80%
Estimated time: 6:00 h
Spent time:

Description

Search engines like Google include text advertisements in their results, and some of them can be confused with the actual search results. We could therefore host our own search page that fetches the Google results and returns a page cleaned of all ads. This custom search page should be written in PHP and use XML parsing tools or regular expressions to extract the content of interest.

The search page could then be accessed using the following kind of URL:

http://search.doudoulinux.org/?lang=fr&query=epiphany+addblock
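To make the idea concrete, here is a minimal sketch of the two steps involved: reading the `lang` and `query` parameters from such a URL, and dropping ad elements from a result page. It is written in Python rather than the PHP proposed above, purely for illustration. The `AD_CLASS` marker is hypothetical (real result pages mark ads differently and change over time), and fetching the upstream results is left out.

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import parse_qs, urlsplit

# Hypothetical marker: we assume ads are elements carrying this CSS class.
AD_CLASS = "ad"

# HTML elements that never have a closing tag.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


def parse_search_url(url):
    """Return (lang, query) from a URL such as the one shown above."""
    params = parse_qs(urlsplit(url).query)
    return params.get("lang", [""])[0], params.get("query", [""])[0]


class AdStripper(HTMLParser):
    """Re-emit a page while skipping every element marked as an ad."""

    def __init__(self):
        super().__init__()
        self.out = []
        self.skip_depth = 0  # > 0 while we are inside an ad element

    def _is_ad(self, attrs):
        classes = (dict(attrs).get("class") or "").split()
        return AD_CLASS in classes

    def handle_starttag(self, tag, attrs):
        if self.skip_depth or self._is_ad(attrs):
            if tag not in VOID_TAGS:
                self.skip_depth += 1
            return
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> never change the nesting depth.
        if not self.skip_depth and not self._is_ad(attrs):
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.skip_depth:
            if tag not in VOID_TAGS:
                self.skip_depth -= 1
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            # Text arrives unescaped, so escape it again on output.
            self.out.append(escape(data, quote=False))


def clean_page(html_text):
    """Return html_text without the elements flagged as ads."""
    parser = AdStripper()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

A parser-based approach is used here instead of regular expressions because nested ad markup is hard to match reliably with a pattern; either option fits the description above.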

Actions #1

Updated by Jean-Michel Philippe almost 13 years ago

  • Due date changed from 08/22/2011 to 11/21/2011
  • Target version changed from 2011-08 to 2011-11
  • Start date changed from 06/20/2011 to 10/01/2011

There are open, decentralized search engine initiatives:

We could eventually host our own search engine node and use it as the default DDL search engine. It is therefore more reasonable to delay this task.

Actions #2

Updated by Jean-Michel Philippe over 12 years ago

YaCy seems to be in its early stages and should not be used until enough peers have joined its network (and bugs are fixed :( ). In the meantime we can use Seeks, which gives quite good results without any ads and without tracking users. Moreover, we don't have a contract with Google yet ;).

Actions #3

Updated by Jean-Michel Philippe over 12 years ago

  • Status changed from New to In Progress
  • Assignee set to Jean-Michel Philippe
  • % Done changed from 0 to 10

NB: if we use Seeks, there is no need to write our own search result cleaner :).

Actions #4

Updated by Jean-Michel Philippe over 12 years ago

  • Due date changed from 11/21/2011 to 02/20/2012
  • Target version changed from 2011-11 to 2012-02
  • Start date changed from 10/01/2011 to 12/12/2011

Actions #5

Updated by Jean-Michel Philippe about 12 years ago

  • Target version changed from 2012-02 to 2012-05
  • % Done changed from 10 to 50

As a workaround, we'll use DuckDuckGo while waiting for a more satisfying solution (i.e. one that is neither centralized nor dependent on ads).

Actions #6

Updated by Jean-Michel Philippe almost 12 years ago

There are several issues with DuckDuckGo:

  • the standard version makes DansGuardian fail because search results are displayed one by one using Ajax calls
  • advertisements are displayed in the search results
  • when safe search is enabled (the default), results are filtered in a way we do not know

However, using URL parameters we can work around these difficulties and get a standard HTML result page that works with DansGuardian. For French, we have to use the following search request:

http://duckduckgo.com/html/?q=sexe&kl=fr-fr&k1=-1&kp=-1

While the standard DDG page isn't blocked by DansGuardian, this one is blocked with a score of 4645 against an allowed limit of 50! With safe search kept on, the score still reached 1005. We then only have to add the special parameter indicating that we come from DDL :).
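For reference, a request of this form can be assembled as in the Python sketch below. The parameter names and values (`q`, `kl`, `k1`, `kp`) are copied from the URL above. The function name, its defaults and the `extra` argument are illustrative assumptions, and the DDL-specific parameter mentioned above is not included because the ticket does not give it.

```python
from urllib.parse import urlencode

# Endpoint of the plain HTML result page, as used in the request above.
DDG_HTML_ENDPOINT = "http://duckduckgo.com/html/"


def build_ddg_url(query, region="fr-fr", extra=None):
    """Build a search URL with the same parameters as the example request.

    `extra` is a hypothetical hook for additional parameters, such as the
    one identifying DDL, which is not specified in this ticket.
    """
    params = {
        "q": query,     # search terms
        "kl": region,   # region/language, "fr-fr" for French
        "k1": "-1",     # value taken verbatim from the example URL
        "kp": "-1",     # value taken verbatim from the example URL
    }
    if extra:
        params.update(extra)
    return DDG_HTML_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_ddg_url("sexe"))
```

Called with the test query from this update, the function reproduces the request shown above; spaces in a query are encoded as `+`.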

Actions #7

Updated by Jean-Michel Philippe almost 12 years ago

  • Due date changed from 02/20/2012 to 09/30/2012
  • Status changed from In Progress to Ready for test
  • Target version changed from 2012-05 to 2012-08
  • % Done changed from 50 to 80

A Debian package has been built with the correct settings for DDG.
