Search engine bias is particularly insidious

“Currently, the predominant business model for commercial search engines is advertising. The goals of the advertising business model do not always correspond to providing quality search to users. … we expect that advertising funded search engines will be inherently biased towards the advertisers and away from the needs of the consumers. … search engine bias is particularly insidious.”

Larry Page and Sergey Brin – 1998

These are direct quotations taken from a 1998 paper by Sergey Brin and Larry Page [1], the founders of Google, written when their company was in its infancy. The quotes can all be found in “Appendix A: Advertising and Mixed Motives” of the paper.  I agree 100% with the quoted statements, but I wonder whether Brin and Page still stand by them. Has Google abandoned its original ideals to become what some may consider a “particularly insidious” behemoth?

I hope to be able to post some details soon of one project I am investing a lot of time in that will remove advertising from the search business model and deliver quality user-focussed search with no commercial bias.  So watch this space!

[1] The Anatomy of a Large-Scale Hypertextual Web Search Engine, Sergey Brin and Lawrence Page, Stanford University, 1998.


Goodbye Google, Hello Qwant!

This is a guest blog post by Jack Warner of TechWarn.

The French government bids farewell to Google and adopts Qwant.

At the end of last year, the French National Assembly announced that they are not going to use Google anymore. All the devices belonging to the French government will adopt a new search engine called Qwant, which is privacy-focused.

It seems the French government wants to protect itself from surveillance by Google and the U.S. government (and their associated comprehensive data-retention policies) and is prepared to take some big steps to do so.

France’s data protection watchdog, the CNIL, also fined Alphabet’s Google €50 million (about $57 million) on 21 January 2019 for breaches of the EU’s online privacy rules. This is the biggest penalty yet levied on a U.S. tech giant under the EU’s new GDPR legislation.

Advantages of using a private search engine

You can enjoy the benefits of search engines which are privacy-oriented in the following ways:

  1. Be invisible – private search engines will use aggregate or non-personal search data, instead of storing your IP address. This will keep your identity private.
  2. Be clean – when you have finished the search, the history on your browser will expire. Therefore, third-parties cannot access your data or search terms, even if they managed to access your computer.
  3. Be free – there are no targeted ads when you use a private search engine, because it does not build an advertising profile of you as mainstream search engines usually do. Any ads shown are related only to your current search term, not to your browsing history.

Top recommendations for private search engines

The following are some of the best privacy-oriented search engines available right now:

  • Qwant: a pioneer of private search engines which was the first based in Europe.
  • Startpage: a private search engine which doesn’t log, share, or track the personal data of its users.
  • DuckDuckGo: this private search engine has a policy which prohibits the sharing or collecting of users’ personal information.

Private search engines versus incognito mode

Let’s backtrack slightly – why use a private search engine when there is incognito mode? Incognito mode usually behaves like a brand-new browser which has just been installed on your computer. It does not have cookies, saved searches, bookmarks, or any pre-filled forms. When you close an incognito window, all the information that is usually collected by the browser is deleted. Even so, incognito mode cannot stop third parties from tracking you, for example via DNS records, and collecting your data, because it adds no encryption of its own. Also, since Chrome’s incognito mode is a Google product that is not open source, who knows whether it collects our data for the mother ship?

On the other hand, a private search engine offers added layers of protection and its users will not be tracked while it delivers the search results. Some even use local encryption for extra privacy protection.

The problem with Google

Google privacy breaches are not unheard of. From your search history to your browsing behavior, Google knows it all. If you stand by your right to remain private and anonymous online, you may want to start looking for alternative privacy-oriented search engines.

Despite the company’s size and industry influence, Google still managed to suffer a data breach that seriously violated corporate compliance. The breach was discovered last March, when it emerged that the personal data of 500,000 Google+ users had been exposed to hundreds of third-party developers working on new apps on the network. The exposed data included usernames, birth dates, email addresses, work histories, and photos. It’s possible that more users were exposed, as the glitch is believed to have existed since 2015 according to engineers at Google+.

Imagine the damage if all that you’ve trusted in Google to keep private, such as your search history, browsing habits recorded on the Chrome browser and traffic from your Google Pixel, was leaked accidentally. Not only does Google violate your privacy by collecting more data than it should, but it also doesn’t do a good job of protecting it. It could be time to look beyond the horizon for newer and better options.

About the author

Jack Warner is an accomplished cybersecurity expert with years of experience under his belt at TechWarn, a trusted digital agency to world-class cybersecurity companies. A passionate digital safety advocate himself, Jack frequently contributes to tech blogs and digital media sharing expert insights on topics such as whistleblowing and cybersecurity tools.

Can blockchain solve the problem of trust?

The answer is no! … Okay, that would be a very short read so maybe I should provide the reasoning behind this short answer.  Also, to avoid appearing too negative on the issue, I should first state that I do believe that blockchain has a lot to contribute positively around the issue of trust.

The internet has connected the world: well over half of the world’s population can now access it, and we can communicate and transact with almost anyone on the planet. The problem is that a small proportion of the people feeding us information or attempting to transact with us cannot be trusted. So, the internet may well have solved a worldwide communication problem, but in so doing a large problem of trust has emerged – i.e. we can now connect with billions of people we don’t know, we know that some of them will be dishonest, but we don’t know who specifically we can trust. The traditional way to address the “who do we trust” problem is to use people we do trust as intermediaries to check information or to police transactions and, of course, this is expensive and time consuming.

Today, with the use of blockchain, we can make financial transactions successfully without using a trusted intermediary – except that we are using blockchain’s distributed ledger as the trusted party. We trust the ledger because it is distributed, transparent and the code that created it is open for verification. In the case of bitcoin the code is relatively simple and generally trusted. However, as blockchain is now being used for more involved tasks, the code can become complex and will inevitably require updates to fix bugs and improve functionality – thus it is helpful to have a trusted party with the authority to make such changes (albeit with high levels of transparency through visibility of the code).

The holy grail for many blockchain evangelists is the Decentralised Autonomous Organisation (DAO), where the organisation hierarchy is flat (i.e. there is no board or government that sits at the top of a hierarchical pyramid to manage the organisation). The governance is automatically orchestrated by smart contracts, the organisation evolves through its members (typically token holders) having voting power, and any changes to the organisation are made by consensus of the members. Even fixing a bug in the constitutional code requires new code to be proposed and a consensus of members to accept it. Reaching consensus quickly on complex issues in a dynamic organisation is not always practical, and so the ideology begins to crumble somewhat. In the future entire countries could perhaps be run as DAOs, but today’s technology is simply not advanced enough to allow the removal of key decision makers.
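To make the consensus step concrete, here is a minimal sketch of a token-weighted vote on a proposed code change. It is an illustration only, not how any particular DAO works: the `Vote` record, the `proposal_passes` function and the quorum and approval thresholds are all hypothetical choices.

```python
from dataclasses import dataclass


@dataclass
class Vote:
    member: str    # hypothetical member identifier
    tokens: int    # voting weight held by this member
    approve: bool  # True if the member accepts the proposed code


def proposal_passes(votes, total_tokens, quorum=0.5, threshold=0.66):
    """Return True if enough tokens voted and enough of those approved.

    quorum and threshold are assumed values chosen for illustration.
    """
    cast = sum(v.tokens for v in votes)
    if cast < quorum * total_tokens:
        return False  # too few members took part to call it consensus
    yes = sum(v.tokens for v in votes if v.approve)
    return yes >= threshold * cast


votes = [Vote("a", 40, True), Vote("b", 30, True), Vote("c", 10, False)]
print(proposal_passes(votes, total_tokens=100))
```

Even in this toy form the practical difficulty is visible: nothing changes until enough members have turned up and agreed, which is slow when a bug needs fixing today.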

For complex systems to be based on blockchain today there need to be many trusted elements in the system, and completely flat hierarchy organisations are not yet practical for all but the simplest of systems run by smart contracts. This might not sit well with many of today’s blockchain evangelists, but the reality is that we have shifted a problem of “who do we trust” to a problem of “do we trust this code” – or even “do we trust the person that wrote the code”. The public visibility of code and consensus to accept it helps greatly. However, most software developers would agree that you cannot always trust your own code to do exactly what was intended once it has reached more than a few tens of lines, even with rigorous reviews and testing (just ask NASA!).

So, it seems we cannot eliminate the problem of involving a trusted party, but we can greatly improve the confidence in the trusted party through transparency and consensus. What this means is that if a trusted party commits a dishonest or negligent act that is not immediately detected, there may not be a direct mechanism to stop the impact of this, but there is an indirect mechanism that will reveal wrongdoing or incompetence and therefore encourages the trusted party to act honestly and diligently in the first place.

Let me use a true story to illustrate this in practice. Estonia is believed to have the most advanced digital society in the world. Everyone has a digital identity and access to most of their own digital information. Access to certain sensitive personal information is only permitted for legitimate reasons by certain parties and all data access is recorded on an immutable ledger. An Estonian police officer once decided to use his privileged access to citizen data for the purposes of checking on his partner. He had no right to view her data for this purpose but had been trusted with access to this system. His partner was able to see that her records had been accessed by the police on a particular day. She questioned the reason for this and on investigation by the authorities it was revealed that it was her partner that had looked at her records. His improper actions had been recorded on a trusted ledger and resulted in him being sent to prison. This story has even been retold by their president and serves as a powerful deterrent to any other trusted person in Estonia considering abusing their privileged access to personal data.

So, although a system with this type of transparency may not, in all instances, stop improper behaviour by trusted parties, the fact that the wrongdoing could be easily detected reduces the likelihood of future abuses and increases trust. In a nutshell, transparency becomes the basis for trust; and the blockchain’s immutable distributed ledgers and open source code are excellent for transparency.
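The idea of a tamper-evident access record can be sketched in a few lines. This is not a description of Estonia’s actual system, and it is a single hash-chained log rather than a distributed ledger; the `AccessLog` class and its field names are hypothetical. Each record commits to the one before it, so a record that is edited after the fact can be detected.

```python
import hashlib
import json


def entry_hash(entry: dict, prev_hash: str) -> str:
    """Hash an entry together with the hash of the entry before it."""
    payload = json.dumps(entry, sort_keys=True) + prev_hash
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class AccessLog:
    """Append-only log in which each record commits to its predecessor."""

    GENESIS = "0" * 64  # placeholder hash for the start of the chain

    def __init__(self):
        self.records = []  # list of (entry, hash) tuples

    def append(self, accessor: str, subject: str, reason: str) -> None:
        prev = self.records[-1][1] if self.records else self.GENESIS
        entry = {"accessor": accessor, "subject": subject, "reason": reason}
        self.records.append((entry, entry_hash(entry, prev)))

    def verify(self) -> bool:
        """Recompute the chain; an edited record no longer matches its hash."""
        prev = self.GENESIS
        for entry, stored in self.records:
            if entry_hash(entry, prev) != stored:
                return False
            prev = stored
        return True

    def accesses_to(self, subject: str) -> list:
        """What a citizen would see: who looked at their records and why."""
        return [entry for entry, _ in self.records if entry["subject"] == subject]


log = AccessLog()
log.append("officer_17", "citizen_A", "traffic enquiry")
log.append("clinic_3", "citizen_B", "appointment")
print(log.accesses_to("citizen_A"))
print(log.verify())
```

The log does nothing to prevent the improper lookup. What it offers is that the lookup is visible to the person concerned and cannot quietly be rewritten afterwards, which is the deterrent described above.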

In summary, blockchain is not a panacea for trust. In a simple sense it seems to be a partial solution to the problem of distrust, i.e. we can transact directly with untrusted individuals without a trusted intermediary if we can trust the blockchain technology that oversees the transaction. Blockchain and smart contracts cannot be inherently trusted – they depend on trust being established in the architects, the coders, the participants, and anyone involved in the implementation and operation of the system. However, the transparency and consensus mechanisms built into the technology are a significant aid to building that trust.

When will Google’s search engine monopoly be broken?

In the 1990s I was an academic at the University of Edinburgh teaching electronics and computing. In 1998 a colleague introduced me to the Google search engine and as a result I made the shift from my previously preferred search engine, Alta Vista, to Google. Twenty years on, despite trying many others, I still use Google regularly. I was delighted by the Google experience 20 years ago, but today I dislike Google search for several reasons but I have yet to find a noticeably better alternative.

In a recent poll conducted by the Search Engine Journal (SEJ) it was revealed that Duck Duck Go (DDG) is the most popular alternative to Google (based on 1097 SEJ Twitter followers who responded). However, according to Statcounter, DDG has just 0.32% (October 2018) of the search market and Google has 92.74%. My question is whether Google’s monopoly could be under threat from emerging search engines such as DDG and, if so, why they have not yet made a major impact on Google’s market share.


Google has held a monopoly position for around 20 years and in that time ‘to Google’ has become a verb. This incumbent position will not be easy to shift. DDG has a very similar product to Google, yet it has not made much of a dent in the market. Its major differentiator is that it respects user privacy. However, if we are honest, it does not deliver truly better search results. In fact, in my experience the results are generally less good, and they are still tainted by significant volumes of undesirable advertising content. Thus, DDG’s market is a niche segment that believes user privacy is essential and is willing to accept slightly inferior search results to achieve this. This market sector (which I include myself in) is huge in world terms – but remains a niche.

My shift from Alta Vista to Google some 20 years ago was based on it being a better solution to my internet search problem; data privacy was not generally thought about in 1998. For me DDG is not yet a better solution. If better results and less advertising (which I know is how they make money) were delivered alongside its enhanced privacy, then it could be a winner. The other problem for DDG is that its federated results use Bing as a major source of data, so it does not have an independent platform on which to base an assault on Google.

My current opinion is that Google will slowly lose market share to a variety of small search engines such as DDG. One of these players will emerge as a future challenger to Google within the next 5 years, and we may not even have heard of them yet. The successful search engine will likely do its own crawling and indexing, so that it is not dependent on Bing, Google or Yahoo, and it will not rely on pay-per-click (PPC) advertising, which most users dislike. It is quite probable that when it does emerge as a leading contender it will be acquired and scaled by a company with a large existing subscriber base and complementary business activities, such as Amazon.

Avoiding the Filter Bubble

For those not familiar with the ‘Filter Bubble’, the term was coined by Internet activist Eli Pariser around 2010 and refers to a state of intellectual isolation resulting from personalisation applied to the delivery of web content. The suggestion is that users become separated from information that disagrees with their viewpoints. This effectively isolates individuals within their own cultural or ideological filter bubble.

While we could debate at length the extent to which this is a problem, few would deny its existence. In his 2011 TED Talk Eli Pariser suggests that search engines should curate results according to the following criteria, not just by relevance:

[Image: slide from the talk listing the proposed “Sort by” criteria]

I personally don’t fully subscribe to this view. It requires that the search algorithms somehow curate the results according to these ethics. Thus, we have to trust someone to decide what ethical parameters are good for us and code these into the algorithms! Or, of course, we could go the other way and completely remove personalisation from the results. Some search engines, such as Duck Duck Go, do not personalise results and so offer us this option. However, we then lose the ability to have more relevant results ranked higher, and so more manual sifting of results by the user is needed. My view is that the Filter Bubble can be largely avoided if we provide visibility of the personalisation process and control over how our personal data is being applied.

Intellectual isolation resulting from the delivery of personalised content is clearly not peculiar to the Internet. In the western world we choose every day who to listen to, what to read and which media channels to watch, and we can switch these on and off at will. If one chooses only to read or listen to right-wing views, then this is a personal right. Each individual is at least aware that not everyone shares these same views, and they can see that alternative reading and viewing material is available. Where Internet personalisation has a real problem is that this isolation can happen (indeed is happening) without there being visibility or self-awareness of it. It becomes especially dangerous if it is used to alter opinions for financial or political gain, and we are now aware of abuses of influence in recent elections and referendums.

I believe that users should be given visibility of the data they are revealing for personalisation. This transparency gives the ability for self-observation (see your digital-self as revealed to others), self-measurements (revealing of statistics, trends and actionable insights), and self-comparisons (how you measure up relative to your peers). This can theoretically be achieved using existing rights to data portability under GDPR and would be a useful first step in avoiding the Filter Bubble.

The next step would then be the ability to give the user control over the degree of personalisation delivered by any service. The user regulates the cost-benefit of revealing personal information and sees how the quality of content delivered is being influenced by this. In reality, if someone wishes to live in a state of intellectual isolation this should be a choice they have made, not a situation that they are unaware of or that has been imposed upon them.
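One way to picture this kind of user control is a single “dial” that blends a neutral relevance score with a personalised one. The sketch below is purely illustrative: the `blended_score` and `rank` functions, the score fields and the linear blend are my own assumptions, not how any existing search engine ranks results.

```python
def blended_score(relevance: float, affinity: float, dial: float) -> float:
    """Mix neutral relevance with personal affinity.

    dial = 0.0 means no personalisation, dial = 1.0 means fully personalised.
    """
    if not 0.0 <= dial <= 1.0:
        raise ValueError("dial must be between 0 and 1")
    return (1.0 - dial) * relevance + dial * affinity


def rank(results, dial):
    """Order results by blended score, highest first."""
    return sorted(
        results,
        key=lambda r: blended_score(r["relevance"], r["affinity"], dial),
        reverse=True,
    )


# Hypothetical results: "affinity" stands for how closely a result matches
# what the service has inferred about the user.
results = [
    {"title": "Opposing viewpoint", "relevance": 0.9, "affinity": 0.1},
    {"title": "Familiar viewpoint", "relevance": 0.6, "affinity": 0.95},
]

for dial in (0.0, 1.0):
    print(dial, [r["title"] for r in rank(results, dial)])
```

The point of exposing the dial is that users can move it and watch the ordering change, so any bubble they end up in is one they can see and have chosen.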

The California Consumer Privacy Act – CCPA

Oh no, more data privacy legislation is coming! The California Consumer Privacy Act (CCPA) will kick in on 1st January 2020. For many of us the key questions are: what is the difference between CCPA and GDPR, and will it affect me?

The short answers are, firstly, that it has similarities to GDPR but is different in several aspects and, secondly, that in theory it should only really affect you if you are a resident of California, or if you are a large company that processes personal information on one or more residents of California. However, its reach will be much broader than this, since companies are unlikely to adopt different policies exclusively for their Californian customers.

I am going to elaborate on these points, but still want to keep it short and so have generalised a little – you can read about it in more detail from the links I will give at the bottom.

How will CCPA affect us?

Most individual citizens may not notice much impact from CCPA, even as it comes near to the enforcement date. We may see a small email storm but less than we suffered for GDPR. The reason for this is that CCPA is largely based on opt-out rather than GDPR’s opt-in.  American citizens (especially those in California) are likely to see more activity prior to implementation. For most of us, we will be told that service terms & conditions are being updated, but then this happens regularly anyway and few of us read these before agreeing.

After implementation one change we may observe is the appearance of opt-outs regarding the sale of data, in the form of a “Do Not Sell My Personal Information” option. The sale of data will be allowed by default and you will need to select this option in order to opt out. In addition, those aged 13-16 should automatically be opted out, and so “I am over 16” might also be ticked as a default. Those under 13 will require parental consent to allow their data to be sold. Interestingly, a company could sell the information of a minor without consent if it does not have “actual knowledge” that the consumer is under 16, so it will be interesting to see how this will be interpreted in practice.

In the short term it is companies with Californian customers that will be most affected. However, the Act only applies to companies of a certain size (turnover of $25 million or more), or that hold personal information on more than 50,000 consumers (or devices). Thus, it will be large companies, most of whom will already have dealt with GDPR compliance, that will be most affected. It is likely that these companies will develop policies for customers that jointly cover GDPR and CCPA.

So, will citizens benefit from CCPA? If we are already under GDPR then I can see little if any benefit. If you are a subscriber to services that do not need to be GDPR compliant (e.g. a US citizen whose personal information is held in the USA) then CCPA can give you GDPR-like rights and benefits regarding the use of your personal information. However, CCPA is not the same as GDPR and in many ways it is weaker.

The main differences between CCPA and GDPR

Who it applies to: CCPA only protects residents of California, but it applies to any qualifying company that holds personal information on a resident of California, and so it will have global impact. GDPR applies to companies operating in the EU and also to companies that hold or process data on EU citizens, so GDPR has a greater global impact.

What does it apply to: The Act applies to “personal information”. The definition is quite wide and similar to GDPR’s, but slightly wider in the sense that if “households” can be identified from the data then it is considered “personal information” even though an individual is not identified.

Who is regulated: CCPA only applies to larger for-profit companies that process or hold significant “personal information”. The company will have a turnover of above $25 million; or process/hold personal information of 50,000 or more individuals, households or devices; or make more than 50% of revenues from selling personal information. Any business that is not in California and does not use information on the state’s residents is completely exempt. GDPR, on the other hand, applies to almost all companies (no matter the size) that use personal information in the EU or related to EU citizens.
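The “who is regulated” test can be written down as a simple rule of thumb. The sketch below encodes only the thresholds as summarised in this post; the function name and parameters are hypothetical, the exact boundary conditions (for example “above” versus “or more”) are my reading of the summary rather than the statute, and none of this is legal advice.

```python
def ccpa_applies(
    for_profit: bool,
    uses_california_data: bool,
    annual_turnover_usd: int,
    records_held: int,
    revenue_share_from_selling: float,
) -> bool:
    """Rough applicability check based on the thresholds summarised above."""
    # Non-profits, and businesses with no Californian data, are out of scope.
    if not for_profit or not uses_california_data:
        return False
    # Meeting any one of the three size tests brings the company in scope.
    return (
        annual_turnover_usd > 25_000_000
        or records_held >= 50_000  # individuals, households or devices
        or revenue_share_from_selling > 0.5
    )


# A small shop with a few Californian customers
print(ccpa_applies(True, True, 1_000_000, 100, 0.0))
# A modest-revenue app that holds data on 60,000 devices
print(ccpa_applies(True, True, 1_000_000, 60_000, 0.0))
```

Note that a company with modest revenue can still be caught by the second test simply by holding data on enough people or devices.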

What rights does it give: Consumers protected by CCPA are entitled to be given notice about the categories of information being collected and the business purpose for which it is being collected, plus any intention to sell this information, with the option to opt out of this sale. Note that CCPA relies on an opt-out policy, whereas GDPR is opt-in. The customer has the right to be told what type of information is being held, although in practice this might be a boilerplate list. The customer may request that their personal information is deleted (and there are a few exemptions to this), so this is similar to, but not quite the same as, GDPR’s ‘right to be forgotten’. CCPA has no direct equivalent to GDPR’s data portability, i.e. the right to request a copy of your personal information. To comply, a company only needs to disclose information about what has been collected over the last 12 months, but the Act does not seem to provide an explicit right to obtain a full copy of the actual data itself. All customers must be treated equally under the Act, meaning that a request made under the Act cannot be used as a reason to alter any terms or pricing for that customer.

Penalties:  Under CCPA, damages of $100-750 per consumer per incident are applicable. For GDPR the penalties can be up to €20 million or 4% of annual global turnover, whichever is higher. So, although different, both can apply severe penalties.
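To get a feel for how the two penalty regimes scale, the figures above can be turned into a quick calculation. This is a back-of-the-envelope sketch using only the numbers quoted in this post; the function names and the example company are made up, and real penalties depend on many factors not modelled here.

```python
def gdpr_max_fine_eur(annual_global_turnover_eur: int) -> int:
    """Upper limit: EUR 20 million or 4% of turnover, whichever is higher."""
    four_percent = annual_global_turnover_eur * 4 // 100
    return max(20_000_000, four_percent)


def ccpa_damages_range_usd(consumers: int, incidents: int = 1) -> tuple:
    """Range of damages at $100-750 per consumer per incident."""
    return (100 * consumers * incidents, 750 * consumers * incidents)


# Hypothetical company: EUR 1 billion turnover, 500,000 affected consumers
print(gdpr_max_fine_eur(1_000_000_000))
print(ccpa_damages_range_usd(500_000))
```

For a large breach the per-consumer CCPA damages can add up to a figure comparable with, or greater than, the GDPR ceiling, which is why both deserve to be taken seriously.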

Read more …

This has been a very superficial look at CCPA, but hopefully its conciseness makes it useful and readable. Here are a couple of articles that provide more detailed information that I found useful in trying to understand CCPA.

CCPA: What Marketers Need to Know about the California Consumer Privacy Act

CCPA and GDPR: Comparison of certain provisions