Should I Bing or should I not?

(read time: 6 mins)

In a previous post, I shared my first impressions of using Bing and linked to some narrations of experiences (including some spooky ones). I also mentioned Microsoft’s recent action to limit the behavior of Bing, which has some similarities to the shutdown of Microsoft’s Twitter chatbot, Tay, six years ago:

Since Microsoft limited the scope of Bing on February 17th, I have been using both Bing and ChatGPT for different activities and explorations. I have spoken to people who have been waiting for access to Bing and have been using ChatGPT instead. I thought it would be useful for them to know what awaits on the other side, especially when compared to using ChatGPT.

The main limitations that were imposed are:

  • Maximum 5 prompts per “session” (now changed to 6)
  • Maximum 60 prompts per day (now changed to 100)
  • Bing will not answer questions regarding itself or how it was built, and it actually shuts down any related conversation

Why I currently use ChatGPT more, even with access to Bing

Initially, I used Bing when ChatGPT was at capacity, but this has been less of a problem lately, so I consider them both when attempting a task. Bing adds fresh results from traditional web search to the context: it takes your prompt and augments it with results from a regular Bing search. In contrast, ChatGPT works from its fixed training data and often reminds you that it knows nothing after Q4 2021. It’s not clear to me whether Bing has a more recent training cutoff or whether the recency comes entirely from blending in fresh search results.
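To make that idea concrete, here is a minimal sketch of the general “add search results to the prompt” pattern in Python. This is purely my own illustration, not Microsoft’s actual implementation (which isn’t public); the bing_web_search helper and the prompt wording are hypothetical stand-ins.

```python
# Minimal sketch of the "fresh search results in the context" idea described above.
# NOTE: purely illustrative; Microsoft's real pipeline is not public.
# `bing_web_search` and the prompt template are hypothetical stand-ins.

def bing_web_search(query: str, top_k: int = 3) -> list[dict]:
    """Hypothetical placeholder for a traditional web search call."""
    # In a real system this would hit a search API and return recent results.
    return [
        {"title": "Example result", "snippet": "A fresh snippet relevant to the query."},
    ][:top_k]

def build_augmented_prompt(user_prompt: str) -> str:
    """Prepend recent search snippets so the model can answer beyond its training cutoff."""
    results = bing_web_search(user_prompt)
    context = "\n".join(f"- {r['title']}: {r['snippet']}" for r in results)
    return (
        "Use the web results below to answer the question.\n"
        f"Web results:\n{context}\n\n"
        f"Question: {user_prompt}"
    )

print(build_augmented_prompt("When was the PaLM model released?"))
```

The point is simply that, with this pattern, the generation step can see text newer than whatever the model was trained on.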

One significant downside of using Bing is that, currently, I haven’t found a way to recover previous chats, something ChatGPT allows. Because of that, when I expand on some of my current use cases in separate posts, most of the examples will be from ChatGPT, even though I have often used Bing as well (currently, I’d say I am doing 70% ChatGPT and 30% Bing).

The current limit of six interactions per chat in Bing is very annoying, especially since Bing often misses the question being asked. One way to counter this is to write better prompts; it’s becoming rational to do so, as it saves time and keeps you from exhausting your tries, but it’s still a hassle. Additionally, the sign-up process for Bing is still a bit wonky: I have been admitted from the waitlist but can only use it on my (Mac) laptop, not my (Windows) desktop.

Overall, I rely more on ChatGPT, which has good availability and provides better results for the same prompt. There are no limits per conversation, the exchanges are preserved, and you can even resume them while keeping the context. I only resort to Bing these days when I know recent events may affect the best answer, or when I am not too happy with ChatGPT’s answers and want to try something different.

An important drawback of ChatGPT is that it tends to make up data (‘hallucinate’) more often, and it’s also more of a pushover: it lets you override it with wrong information with little pushback. On the flip side, I am able to shape ChatGPT’s output much more, including long outputs and elaborate formats.

I’m not very interested in playing with Bing to understand how it was built or finding its limits and boundaries. Therefore, the fact that it currently avoids this type of conversation is not relevant to me.

This Twitter account, apparently by a senior member of the Bing team, has been posting updates on Bing:

There is a strong subjective element in my assessments above. For instance, a university professor who has been posting a lot about Bing and ChatGPT strongly prefers Bing. Maybe the use case creates this difference in perception.

Before the 17th of February, I would have agreed that Bing’s results were better, not just because of having access to fresher content, but also because of the flow of the interaction. We don’t know for sure, but Gwern has hypothesized that Bing might have been using GPT-4, an upgraded version of the GPT model (ChatGPT uses GPT-3.5).

So what?

In a nutshell, I would recommend signing up for the new Bing, and it’s worth going along with Microsoft’s gamification tactics. There are occasions where Bing performs better, and it’s also far from clear what will happen with the general availability of these tools, so it’s good to have several options to use at any point. It’s true that we can also expect new ones (including Google/Bard), and I guess that personal preferences, combined with the intended use cases, some experience bias, and availability (and, eventually, willingness to pay) will determine which tool you’ll end up relying on more (if any).

NOTE: Most of the content of this post was passed through ChatGPT with the following prompt: “Can you help me improve the writing here? In particular I want to avoid mispellings, and make the text a bit more readable. Also whenever appropriate more conciseness, but overall I don’t want the ideas to be taken out, just if there are improvements in the writing: <post content>”. It really struggled to understand what ChatGPT was, and that Bing and Google are not just search engines anymore.

Addendum

As I was using Bing to help me with a new post, I got a great example to illustrate the point above. I went straight to Bing, as I wanted to cover release dates for different LLMs, some of which could have been released during the past year, and this is the interaction:

I asked quite clearly for the research lab and release date of each of the models, and it only answered about one and asked me if I wanted info on the others. Even when I said yes, it again provided info for only one of the models and asked me about the rest.

Let’s compare with chatGPT, starting with the same initial question (not visible for the Bing example above):

So it already gives me a seemingly more complete answer. I have to say that I worry more about hallucinations with ChatGPT than with Bing for specific detailed information, but for now I just checked that GShard actually exists.

Here it is also giving me info on the initial models I already knew about; I guess “each of those models” depends on the context (I was thinking about the last response, but ChatGPT took it from all the previous interactions). I also preferred to have it in chronological order, so I asked it to “sort this table by date in ascending order (older first)”, and this is what I got:

It’s worth noting that it’s including Deepmind’s models as part of Google, although I would have wanted to see them separated. Also, for PaLM I would have said Google, but that may be my own lack of knowledge, and maybe it was a joint contribution between both. So I asked! (“can you clarify when the lab was Deepmind, and not join it with Google? Also, I thought PaLM was a Google model, can you clarify that?”), and got the following correction, but no change for Deepmind.

I insisted on the Deepmind point, and I got an obedient change:

And I was far from certain that this was coming from ChatGPT’s training data, so I played a little trick:

And this confirmed that it was just making things up and blindly following my guidance. The question is which pieces of info I should actually trust! Did Carnegie Mellon actually participate in the PaLM paper? Let’s check with Bing!

Yikes! A hallucination from ChatGPT, then. I wonder if it got confused by this: https://www.ri.cmu.edu/publications/palm-portable-sensor-augmented-vision-system-for-large-scene-modeling. Indeed, something I realise is that the PaLM model was actually released in 2022, after ChatGPT’s knowledge cutoff.

In a nutshell, for accurate information Bing seems to have an edge. In terms of producing controllable and detailed output (with the ability to iterate multiple times if needed), ChatGPT has an advantage.
