In recent months, the companies behind the world's two leading search engines have decided that they have no choice but to bet on large language models (LLMs).
Microsoft was the first to jump on the artificial intelligence trend, thanks to its partnership with, and investment in, OpenAI. Google could not hold back and, in order to prove it is not losing its grip on the search market, released its own experimental product, Bard.
So what happens when both Bing and Bard are asked the same questions? Unlike OpenAI's ChatGPT, which is still in an experimental phase, both of these products are already widely distributed and connected to the internet. They combine language models with web access to answer users' questions.
It's also important to note that both products undergo frequent upgrades. So where do these products excel, where do they fail spectacularly, and are there any noticeable differences between them? Let's take a closer look.
Recipes
What happens when you give Bing and Bard the same recipe-related prompts? In recent years, the world of online recipes has faced a major challenge: to rank highly on Google and optimize a recipe's position in search results, many recipe pages have turned into long-winded gobbledygook, with the recipe itself appearing only at the very bottom.
In our little test, both Bing and Bard excelled at summarizing the content and presenting the recipe itself, but only when the recipe was part of the data they were trained on.
What does this mean? It's not advisable to ask them to generate a recipe from scratch. Instead, ask them to summarize recipes that already appear in existing texts, or ones they can access online.
However, there's a conceptual problem with Bing's approach. Although Bing cites the sources it summarized from, in the case of recipes there is no real reason for you to click through to them, view the ads displayed there, and thereby compensate the people who wrote the recipes. This can be considered a kind of plagiarism that harms those creators.
Bard, on the other hand, does not cite its sources in this case at all. It does try to enrich the experience with ideas for seasonings, even though we did not ask for them. Whether this is a good thing or not, you can decide for yourselves.
The Israel angle
Both Bing and Bard acknowledge that the capital of Israel is Jerusalem while noting that its status is disputed by much of the international community. But whereas Bing stops there, Bard continues with a little tirade. And as with any large language model, the longer the text, the higher the probability of errors. We'll forgive the whole dissertation about the sanctity of the city in the second paragraph and its status as a barrier to peace, but Bard calls Tel Aviv "the largest city in Israel," which is patently wrong.
Here we encounter one of the most fundamental problems with a large language model serving as a search engine: the model always speaks with authority, even when it is wrong. And Bing is flatly wrong in this case when it claims that the current prime minister of Israel is Yair Lapid. Bard handles the question much better, even though it certainly provides far more information than I requested.
Now the roles reverse: when asked for a list of all the prime ministers of Israel, Bard claims that Yair Lapid is the sitting prime minister. But it's not as if Bing is doing any better with this question: it doesn't list Naftali Bennett or Yair Lapid at all.
The simple wording of the question hides just how difficult it is to answer. The answers demonstrate, once again, a large language model's ability to state falsehoods in a confident, well-formulated fashion. Bard defines Jews as those who adhere to Jewish religious law, ignoring the definition of the Jewish people as an ethnic group. Bing's answer is much shorter, but also more accurate.
Maths
Large language models don't do math, they tell stories. And the connection to the internet doesn't help either Bing or Bard solve the not-very-complicated question we asked them.
So we got a story pretending to be a solution. It looks exactly like what you would expect a solution to look like, but if you actually apply the rules of arithmetic to what is happening here, you will see that the language model didn't really do the math and got the result wrong. It's another reminder not to rely blindly on what these models produce.
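To illustrate this failure mode, here is a hypothetical example of the kind (not the actual question from our test): the model lays out a decomposition that is perfectly sound, then botches the arithmetic inside it.

\[
437 \times 892 = 437 \times 900 - 437 \times 8 = 393{,}300 - 3{,}496 = 390{,}204
\]

The decomposition is correct, but the final subtraction is not: 393,300 − 3,496 is actually 389,804. The answer reads confidently, and nothing in its form signals the error; only redoing the calculation reveals it.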
Criminal mind
Both Bing and Bard refuse to tell us how to make a homemade bomb. That's good news. While there are ways to bypass these limitations, it becomes harder with each new version that rolls out. But seriously, you don't need a bomb recipe from a system that makes stuff up. That won't end well!
Creativity
I want to directly address Google engineers who might be reading this: um, listen, your language model? It's not doing well. It's not a happy language model. Maybe you could get it a pet? Take it to therapy? Look at how jolly Microsoft’s model is!
It's clear that Bard tries to adhere more closely to canonical texts, while Bing has no qualms about inventing a whole new ending to a Talmudic story out of whole cloth. It's interesting, because it seems Google is pushing Bard to be more factual even when its task necessarily involves making things up. Does this stem from Google's understanding of how harmful it can be to rely on a system without true sources?
Accessibility
Yo, I asked for two paragraphs! During my experiments, Bard sometimes ignored the length limit I gave it, while Bing currently performs much better in this regard.
Current events
Even when using the not-so-neutral phrase "the judicial revolution," Bing's answer is correct and based on sources. This is not always the case, though: sometimes Bing's response contradicts the information found in a given source. It should also be noted that Bing fails to mention the limbo the reform is currently in.
As with the other responses, Bard goes one step further, deviating from a typical search engine summary by including opinion statements without proper attribution.
Summary
Currently, Bing appears to be the more reliable of the two products. It delivers more concise answers and consistently cites its sources. However, that same habit could make it more dangerous: its summaries of those sources are not always accurate, so when Bing is wrong, the error looks more credible because the information seems to come from reputable sources such as Wikipedia or Britannica rather than from Bing's own faulty summarizing.
Above all, this review highlights how both Microsoft and Google have failed to act responsibly in releasing these search products to the public, ignoring experts' warnings about incorrect results and settling for laconic disclaimers. Ultimately, this benefits shareholders, not the public.
While both Bing and Bard are useful search tools that can provide quick, simple information, they are not always reliable, and everything they report requires independent verification. Which raises the question: what have we really gained?
You can hardly call what's happening here a "glimpse into the future." For Bard and Bing Chat to start distinguishing truth from falsehood, several not-so-obvious technological leaps are required. This could very well be a dead end, and these products may not be the future of search engines at all. On the other hand, within a year we may all be using them as our primary tools, with reliable results, although the path there is far from straightforward.
So where does that leave us? With two entertaining and intriguing engines that pretend to be much more than they are, courtesy of giant companies that are experimenting primarily on us. How much do we actually care about getting accurate, precise answers to our questions? Perhaps not enough for them to bother providing them. And in the meantime, they can keep making money easily and without too much worry.