Data Exfiltration Using Copilot in Edge?

So… I’m on a work machine, which has all sorts of gubbins that IT security folk put on it. But it’s also classed as an “autonomous” machine that I have root access on, which lets me install and build things. I’m running Microsoft’s Edge browser, although I forget where I installed it from. I’m pretty sure this would have been from the Edge download site, rather than the organisation’s software download catalogue.

The Edge browser is running a Copilot sidebar. I imagine this is something I might have opted into originally when CoPilot first became available in Edge.

I have a profile selected, but I’m not sure I’ve signed in. I have no direct recollection of knowingly using my current organisation password to sign in to the Edge browser at least since the last time I changed that password.

I’m looking at an OU authenticated page for an old module on the VLE website. This particular page is quite light on content in the central pane, and there is lots of navigation with meaningful titles. So the balance of main text and navigational text is quite balanced. I ask Copilot to summarise the page, which it does.

[UPDATE]: according the the IT folk, “It’s the enterprise version of Chat GPT, assuming you’re signed in the main advantage of which is it doesn’t store the prompt data, unlike Chat GPT. If signed in you should see this at the top of the panel”:

From the summary, it obviously has access to the page content, the page being an authenticated page.

So how does it do it?

There are several possibilities:

  • it generates the summary using a local model by taking a copy of the page content, passing it to the local model as context, and prompting for a summary; the contents of the web page stay on my machine; see, for example, Running AI Models in the Browser;
  • it looks at the URL of the page I am on, finds that on the public web, and uses that as context either in a local model (which would be a bit silly doing the remote lookup), or using a third party service. But the page I’m looking at is authenticated, so there is no online public version of it;
  • the browser sends the page URL to a remote server along with some of my credentials (e.g. the auth tokens/cookies set when I logged into the web page); the remote server logs in and pretends to be me, looks up the URL, extracts the page content and uses it as context; this would not be ideal…
  • the browser knows I’m me from from my browser profile, and knows I’m me from logging into the web page, so it uses some sort of backend Microsoft federated auth thing to allow the summariser model access to the authed web page. Again, this would not be ideal…
  • the CoPilot sidebar grabs a copy of the page content and sends it to the remote service, where it is used as context for the summariser.

So… how can check what the browser may or may not be phoning home?

Most browsers make a powerful debugging environment available in the form of a Developer Tools or Dev Tools application.

If you want to know anything about a web page, what it’s loading, what it’s storing, what it’s phoning home, this is where to look.

So for example, I can look at the network traffic.

This tells me what the page is doing, but not the CoPilot sidebar.

However, if I right click in the sidebar, I get an option to Inspect.

Which opens up devtools for the extension.

Which shows us that the sidebar talks to Bing…

…and can phone the page contents back there:

We also seem to have content chunked on linebreaks:

If I copy the webcontents, we can trivially find our module content in there:

So… can I do the same in private browsing? Nope… it seems that CoPilot is not available in that view?

How about if I create a new browser profile without any credentials? Using CoPilot to view the authenticated page, I get:

Ah, so… maybe I had accepted that previously when I first tried out CoPilot in Edge months ago.

I let that dialogue time out and disappear without me accepting either way, and ask for a page summary. The dialogue reappears briefly and then I get:

so it seems that my organisation is happy for me to be using CoPilot? If I accept, will the page get a summary?

I get the dialogue, again, asking me to allow Microsoft access to page content. (The timeout without me answering is an implied not yes, not no…)

Let’s quit that profile and start a new one… I’ll visit a public page – a BBC news page for example, and ask CoPilot for a summary. I get the Allow Microsoft to Access Page Content dialogue, and accept. I ask for a page summary, and a copy of the page content is sent to Microsoft. I’m not sure what the BBC web page license conditions are when it comes to me sending a copy of their content to a third party?

Maybe because I am introducing third party content into the chat, the chat disables chat history before going any further. This may or may not have anyhting to do with sharing arbitrary content from a viewed page with Microsoft/Bing.

Note that I seem to have granted Microsoft access to all pages I now view, not just the previous page or the previous domain. This is unlike a cookie notice, which typically applies to a domain.

If I now auth into my organisational site, view a VLE page and ask for a summary, I get one.

I don’t notice the any prompt this time about whether my org has given me access to CoPilot.

However, by creating a new profile, with no contact details or other history, it seems that I can access CoPilot, grant it permission to read page content, log in to an authenticated site, and send it off to whatever chat model CoPilot/Bing is running. Or use a profile where I perhaps once in the past granted CoPilot a universal privilege, and henceforth don’t get any challenges about sharing content from other domains back to Bing.

I guess the next step to watch out for, eg if CoPilot starts allowing me to add third party “agents” to my CoPilot experience will be: can I get Bing to then send the page content to an arbitrary third party service?

PS I wonder… could we have some content in each VLE page in a hidden CSS style that can act as a prompt injection thing when CoPilot uses the page content as context? Add additional guardrails, nudge CoPilot into a particular role, get it to give users a warning about not cheating etc etc?

PPS tinkering a bit more, in a conversation that already included a previous prompt asking to summarise an (authed) page, I can:

  • load a new authed page (I note: this was in the same tab and the result of a click through on a link in that page to a new page on the same domain; I haven’t yet tried to see if changing browser tabs and loading a page in a new tab results in that page being pre-emptively sent to Bing without further prompting in the CoPilot window; or if it will pre-emptively send content from a page on a new domain if I explicitly enter a url to a new / authed domain in a tab in which I had just summarised an unauthed public page);
  • CoPilot will send the content back to Bing before I make any further prompts.

This means that I can visit a page that contains personal information and without entering anything further into the CoPilot dialogue, it will send the contents of the page to Bing. And yes, I do have screenshots.

Author: Tony Hirst

I'm a Senior Lecturer at The Open University, with an interest in #opendata policy and practice, as well as general web tinkering...

Leave a comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.