11:00 am. Side room. IoT.
PETER STEINHAUSER: So, all right, good morning everybody and welcome to the IoT working group session today. I am Peter Steinhauser, a co‑Chair of the IoT working group; my colleague Peter Wehrle could not attend in person today. I think he is still trying to get online and join by Meetecho; he had some technical issues, but he will be with us later, I think.
All right. A quick look at the agenda, so we have some very interesting talks today and I am really excited about that. But let's start with a little bit of introduction and housekeeping first.
We start with the RIPE 89 minutes validation, then have a look at the Code of Conduct and also the participation instructions for in person and remote participation.
So I also want to welcome all our remote participants and I am really happy to see that the ranks are filling here in person, which is great.
All right, so the RIPE 89 minutes have been approved. The link is on the slides so, if you want to read it from Prague, please feel free to do so.
Yeah, the Code of Conduct: we are trying to make everybody feel welcome and accepted here. Please be polite and treat each other with respect; even if you have conflicting opinions, we can deal with that like educated people, without losing our manners.
For the attendees participating in person: during the Q&A after each talk, go to the mic and ask your question. Before asking your question, please say your name and your affiliation so we can record this properly.
For remote participants, here are the instructions for Meetecho. Our colleagues are monitoring the Q&A part of Meetecho and will go to the mic and read the questions so we can answer them properly.
OK. And with that, we come to our first talk, and I welcome Anna Maria Mandalari from University College London. I am really excited to see her here; thank you, Anna.
(APPLAUSE.)
ANNA MARIA MANDALARI: Thank you, Peter. Thank you. Today I am going to talk about something that has been going on in the IoT community in the last few years, that is, policy certification and verification for cyber security in internet of things devices. You may be familiar with some of the regulations I talk about today.
We have a great agenda. First of all, I want to talk about the reasons why regulations and standards are needed nowadays for internet of things devices. Second, I am going to give you some examples of privacy and security issues in internet of things devices that we found in our lab at University College London. Then I am going to talk about some solutions: how we can automatically test these devices for compliance, some standards that are being developed in Europe nowadays, what the current gaps are, and a conclusion.
So let's start with the problem: why you should be worried about owning an internet of things device nowadays.
For those who attended the RIPE meetings in the last years, you are probably now familiar with the IoT testbed that I direct. We have hundreds of internet of things devices located at University College London: speakers, cameras, more than 200 devices, and we have a copy of the devices in our laboratory in the US. What we have done since 2019 is collect all the traffic that these devices are exchanging over the internet, and we have the capability of interacting with these devices, like remotely triggering their activities through a central server that connects directly to the devices. Today I will give you some examples of privacy issues that we found for these devices; we have many publications from 2019. One of the privacy examples is the reason why you shouldn't watch television.
So, the motivation: in the past, there were a lot of papers on tracking and advertising services looking at your Smart TV, and we wanted to know if this was true and how smart televisions behave with respect to a new technology that nowadays is built into many smart TVs. This technology is called automatic content recognition. Basically, the television that you own, because this option is enabled by default on your Smart TV, takes screenshots of what you are watching many times per second. Usually you can watch television using different services: Netflix, linear TV, antenna, FAST TV (for example, channels provided by Samsung), external devices or screencasting. What happens is that with this new technology, an automatic content recognition client is installed by default on your Smart TV. It captures what you are watching, and the screenshot is hashed and then sent to the automatic content recognition server, a database of content where they match what you are watching, so they can create a unique profile for you, knowing if you are interested in sport, shopping, travel. The reason is that they will then be able to serve you the right advertisement on other channels in the future, and also through the smart television itself.
What we wanted to know is how frequently automatic content recognition captures snapshots of your viewing activity. For doing that, we developed a methodology for which we installed two televisions, from Samsung and LG, in the US and the UK. We discovered that they captured screenshots and videos every 500 milliseconds, and audio, from LG, every ... milliseconds. We set up an infrastructure where we could capture all the traffic, and particularly the traffic that is going to automatic content recognition servers, from two different locations. We wanted to know whether the service is stopped when you opt out of automatic content recognition, and which automatic content recognition servers are contacted and whether they are in the same country where you are watching the TV or not, and we did this in the UK and the US. We saw that automatic content recognition is used for most of the services that you are using. And the most worrying part is that the Smart TV records your screen even when you use the television as a dumb display ‑‑ so even when you are connecting your laptop through an HDMI cable, the television is still taking screenshots of what you are watching, even if it's private information. Luckily, when you opt out of the service, the service is stopped for both Samsung and LG. At least they are in line with GDPR after that.
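A capture interval like the one reported above can, in principle, be estimated from the timestamps of traffic sent to an automatic content recognition server. The following is a minimal sketch and not the methodology used in the study: the function name `estimate_capture_interval_ms` and the timestamps are hypothetical, and it assumes that each capture produces one upload burst whose start time has already been extracted from a packet capture.

```python
from statistics import median
from typing import List


def estimate_capture_interval_ms(burst_starts_ms: List[int]) -> float:
    """Return the median gap (in ms) between consecutive upload bursts."""
    if len(burst_starts_ms) < 2:
        raise ValueError("need at least two bursts to measure an interval")
    ordered = sorted(burst_starts_ms)
    # Gap between each burst and the one that follows it.
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    return median(gaps)


# Hypothetical timestamps (ms since capture start), not data from the study.
acr_bursts_ms = [0, 500, 1000, 1510, 2000]
interval_ms = estimate_capture_interval_ms(acr_bursts_ms)
print(f"estimated capture interval: {interval_ms} ms")
```

The median is used rather than the mean so that a single delayed burst does not skew the estimate.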
We did a comparison across the US and UK and we noticed that for some services, some applications installed in the television, the automatic content recognition is stopped automatically, for example Netflix, but only for UK, not for US.
So then we contacted Netflix and asked what was going on but they did not reply.
Now, this is just an example of how smart devices can use your data, and probably I am sure that none of you knows that this content, this service, is enabled by default on your TV. So as soon as you go home, just go to your privacy policy and disable it. Because it's there by default.
But this was just one example of privacy issues; there were many other privacy issues that we found for smart devices. For example, cameras sending the MAC address unencrypted over the internet, or some personally identifiable information sent over the internet. We found, for example, emotion sent to some third‑party services in overseas countries with respect to the UK. If you are interested, just go to my web page; you will find many of these works from 2019.
Now, privacy is not the only issue, there's another issue of security for the internet of things devices and I want to talk about just one of the latest security issues we found with some smart medical devices, so this here, we have another reason why you shouldn't use a smart medical device.
Nowadays you can buy these wearable medical devices everywhere.
You go to Amazon and you buy these glucose sensors; you are scanning your skin, the sensor is connected to your skin, and they can be used in an open loop, meaning they are connected via Bluetooth to your phone, or in a closed loop, meaning that these devices can be connected directly, for example in the case of a CGM, which is a glucose sensor, to the insulin pump. In this case, the sensor measures the sugar in your blood and it regulates the insulin that will be injected into your body.
OK. So most of these devices use Bluetooth Low Energy. We set up a mini testbed with some Bluetooth adapters; the cost of the second one was like 13 pounds on Amazon, the cost of the first one was like 40 pounds. In any case, normal Bluetooth adapters. Unfortunately, we were able to do many attacks on these devices: sniffing, man in the middle, where we could do data manipulation, we could stop the devices from working, and we could do replay attacks. As you may know, Bluetooth devices have a lot of security issues; some of these issues can be solved by changing the session key every time you reconnect with the device. We discovered that this is not happening for the majority of the devices we tested.
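Why changing the session key on every reconnection helps against replayed packets can be illustrated with a small sketch. This is a generic message authentication example built on Python's standard `hmac` module, not the Bluetooth Low Energy pairing or encryption procedure; the payload and the helper names (`new_session_key`, `tag`, `verify`) are hypothetical.

```python
import hashlib
import hmac
import os


def new_session_key() -> bytes:
    """Generate a fresh random 128-bit key for one connection."""
    return os.urandom(16)


def tag(key: bytes, payload: bytes) -> bytes:
    """Authenticate a payload under the given session key."""
    return hmac.new(key, payload, hashlib.sha256).digest()


def verify(key: bytes, payload: bytes, mac: bytes) -> bool:
    """Check a received (payload, mac) pair in constant time."""
    return hmac.compare_digest(tag(key, payload), mac)


# Session 1: an attacker in radio range sniffs an authenticated reading.
key_session_1 = new_session_key()
reading = b"glucose=5.4"  # hypothetical payload
captured = (reading, tag(key_session_1, reading))

# Device that keeps the same key after reconnecting: the replay is accepted.
replay_accepted_static_key = verify(key_session_1, *captured)

# Device that derives a new key on reconnection: the old packet is rejected.
key_session_2 = new_session_key()
replay_accepted_rotated_key = verify(key_session_2, *captured)

print(replay_accepted_static_key, replay_accepted_rotated_key)
```

Key rotation only addresses replay across connections; replay within a single session would additionally need something like a counter or nonce in each message.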
So, for example, for an oximeter we were able to manipulate the data. For an ECG sensor, we were able to inject a ... report. In the case of a blood pressure device, we could emulate hypertension, even if this wasn't the real data from the device.
But the most worrying one is really the CGM, the glucose sensor. We weren't able to manipulate the data, but we could easily do DoS attacks on the devices; we could stop the device from working. And you understand, if we can stop the device from working while we are in range ‑‑ the maximum range of Bluetooth Low Energy is 100 metres in theory but 25 metres in practice ‑‑ and we were able to stop the sensors from working, this means that if the sensor is connected directly to an insulin pump, you can potentially kill someone.
So these are our experimental results; we tested many devices, and the green ticks are the attacks that we could successfully perform.
Don't do this at home; this is just an explanation. We did responsible disclosure to these device manufacturers; some of them, again, did not reply, some of them replied and said that they will replace the device in future updates, but unfortunately the devices that we tested are still on the market.
So we need regulations, more regulations on these devices. We really need security by default for these devices. How can we do this?
Luckily for us, we live in a country that is full of regulations; they love regulations in Europe.
So they released and approved the European Cyber Resilience Act. Now, what is the European Cyber Resilience Act? In my personal opinion, at the moment, it's a big mess for IoT manufacturers. Why? If you are not familiar with it, this is a regulation that will be enforced in 2027; there is really minimal time for IoT manufacturers to implement it. Any internet of things manufacturer that sells any device in Europe will need to be certified. 90% of the devices can have a self‑assessment, can be self‑certified; 10% of the devices will need to go through a third‑party assessment, so they need a third‑party certifier. What is the problem? If you read the 440 pages of the regulation, and you are a technical person and a developer, you won't understand one word; you don't know where to start or how you can be compliant, so there is no other way than getting a fine in this case.
Right.
And it is the same for the third‑party certifier: it's still not clear how this will be implemented. As a developer, as an IoT manufacturer, where do I need to go in order to get this certification? How much will it cost? It will be super expensive, right?
And then you are also familiar with the wonderful GDPR, because you have probably read the 40 pages of privacy policy that you get every time you are installing devices in your home, etc. GDPR also needs to be enforced, and for these regulations we need automatic tools; we need to provide something that will help the developers and the IoT manufacturers to be compliant, and not only them, but also the enforcement agencies. For example, in the UK, the Information Commissioner's Office is really just issuing guidelines for internet of things devices and internet of things manufacturers, but in all the interactions that we had with them, they said it's complicated to really enforce GDPR. You see this with the smart televisions, right? If they are doing something that's weird or circumventing the regulation, it's very complicated for an enforcement agency, and we are talking about only 120 people working there, to enforce regulations on the 22 billion devices we have out there nowadays.
So at University College London we implemented systems that hopefully could allow automatic compliance; there's a lot of work to do on it, but an example can be IoTrim+. In the past I presented this: it's a system for detecting non‑essential destinations. This is about privacy and non‑essential destinations for a device: if a device is contacting X destinations, how many of these X destinations are actually used for the device to work? In the past, this was a semi‑automatic methodology for which we took the devices in the lab and blocked all the destinations one by one; if the device was still working while a destination was blocked, this means the destination is non‑essential for the device to work.
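The block‑one‑by‑one procedure described above can be sketched as a simple loop. This is an illustration under stated assumptions, not the IoTrim implementation: `works_when_blocked` stands in for the lab step of blocking a destination and checking the device's functionality, and the domain names are hypothetical.

```python
from typing import Callable, Iterable, List


def find_non_essential(
    destinations: Iterable[str],
    works_when_blocked: Callable[[str], bool],
) -> List[str]:
    """Block each destination on its own and keep those the device can live without."""
    return [dest for dest in destinations if works_when_blocked(dest)]


# Hypothetical stand-in for the lab test: only the vendor API is really required.
REQUIRED = {"api.vendor.example"}


def fake_lab_probe(destination: str) -> bool:
    """Pretend to block `destination` and report whether the device still works."""
    return destination not in REQUIRED


contacted = [
    "api.vendor.example",
    "ads.tracker.example",
    "metrics.thirdparty.example",
]
non_essential = find_non_essential(contacted, fake_lab_probe)
print(non_essential)
```

Testing destinations one at a time does not reveal destinations that are only essential in combination, so a real test harness may need additional passes.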
Now, since we also had the list of domains that are non‑essential, we thought we could build a machine learning model on the network traffic, so that every time we see a new destination for a specific device, we may be able to automatically detect whether it is essential or not for the device to work. This worked very well for some of the devices, provided that while you are training the model you have data from those devices in the training set. So the model in this case needs to be device specific, or at least you need a model that will work for the same manufacturer. It doesn't matter if it's, for example, an Amazon Firestick or an Amazon Alexa, but it needs to be trained on data from the same manufacturer because they have similar behaviour.
Regarding security, you might be familiar with this standard, which was released by ETSI a few years ago. It is a set of requirements that internet of things devices, particularly consumer internet of things devices, need to follow in order to be secure. So how can we take these regulations, guidelines or standards and convert them into something that can be measured by just using the network traffic: defining some active and passive tests on the traffic that can be deployed in order to test the devices with respect to the requirements, and then producing a report or certification for the devices with respect to the requirements listed? In this case it is this standard, but this methodology could potentially work for any regulation or any document written in legal language.
So we did this for some of the requirements that are in the standard. An example: one of the requirements is that the device must not have open ports that are not used by the device.
So the question here is: it will be easy, right? The methodology will be to just do an Nmap scan, an open port scan; you have a list of open ports, then you check how many ports are used by the device and you are done, you compile the list. But there are many problems in this research. For example, we want to do this in the IoT gateway, in the router, right, so one question for this specific requirement is how much time we need to observe the behaviour of the device in order to see all the open ports that are used. A device that is streaming content, like a Smart TV, is completely different from a smart bulb: a smart bulb might have one or two ports in use and use one protocol, while a smart television is streaming content. So you have different requirements for different devices, and we are doing this study to understand whether it's actually possible to convert these regulations into something that you can check by looking at the network traffic.
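The open‑port check described above, and the reason the observation time matters, can be sketched as a set difference between scanned and observed ports. This is a minimal illustration with hypothetical port numbers, timestamps and function names; it assumes the scan result and the passively observed flows are already available as plain Python values.

```python
from typing import Iterable, List, Set, Tuple


def ports_seen_within(flows: Iterable[Tuple[int, int]], horizon_s: int) -> Set[int]:
    """Ports the device was observed using up to `horizon_s` seconds of monitoring.

    Each flow is a (timestamp_s, local_port) pair.
    """
    return {port for timestamp_s, port in flows if timestamp_s <= horizon_s}


def unused_open_ports(open_ports: Iterable[int], used_ports: Iterable[int]) -> List[int]:
    """Open ports that were never observed in use (candidate violations)."""
    return sorted(set(open_ports) - set(used_ports))


# Hypothetical data: result of a port scan and passively observed flows.
scanned_open = [23, 443, 1900, 8009]
observed_flows = [(5, 443), (30, 8009), (4000, 1900)]

short_watch = unused_open_ports(scanned_open, ports_seen_within(observed_flows, 60))
long_watch = unused_open_ports(scanned_open, ports_seen_within(observed_flows, 7200))
print(short_watch, long_watch)
```

With a one‑minute observation window, port 1900 is wrongly reported as unused; it only shows up in the traffic after more than an hour, which is the observation‑time problem raised in the talk.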
So we saw a big problem here, right? On one side we have internet of things manufacturers, on the other side we have regulation, the Cyber Resilience Act, and in some way IoT manufacturers need to be compliant with these regulations. This is why the European Commission decided to have standards: when you are in doubt, create another standard.
And this is what happens. So there is a standardisation effort from the European Commission regarding the Cyber Resilience Act. This standardisation effort is given by the European Commission to ... ETSI will be responsible for the technical standards and specifications for IoT, and Cen‑Cenelec will be responsible for aligning standards and developing harmonised standards. What is the problem with this? It's that there's no time: this standardisation needs to be ready by the end of 2026 for some of the standardisation requests, and some of the vertical standards need to be ready by 2027. In any case, the time is really tight.
And when I was reviewing these standards and the effort that is being made by these organisations, I saw there is a big gap here: the gap is how to link the standards with what is really needed from a developer's point of view. Again, you look at the standard requirements and you don't know where to start as an IoT manufacturer. So IoT manufacturers, and also small and medium enterprises, need help with this. There's a big gap in how to implement these standards. So how can we create a standard ‑‑ and I was thinking this is actually not an offer but a question for you ‑‑ how can we create a standard based on the network behaviour of IoT devices?
So if you are interested, we got some funding, thanks to RIPE, for helping with this standardisation effort. The final goal is to create open tools for SMEs and for internet of things manufacturers that will help them to be compliant with the Cyber Resilience Act; tools like, for example, the ones that I described before, around one of the requirements of this standard, like open ports.
And this will be an OpenSource repository where we can put all the tests that can be done using the network traffic, with which IoT manufacturers can then demonstrate their compliance with the Cyber Resilience Act. It's not an easy job, but I guess if we work together this can be done, so please help with this; if you are interested, get in touch.
Now, what is next? We really want to make compliance easy, so easy that even your light bulb could do it. In order to do this, we are developing this system, this software and these active and passive tests to be able to use them directly on the IoT gateway. So we need to optimise for the resources that we have there, because you know very well that routers have limited resources: a few megabytes of RAM, and the CPU is very limited. Then we need mitigation, and real deployment and evaluation. We need to understand the problems of third‑party certification; we need to work together with policy makers of national cyber security agencies to understand how each country is going to do the certification and how the certification will work. Maybe we also need a standard to certify the certifiers, something I am thinking about. And then privacy and security label certification: in the future each device may have a label. This is a scheme that is already happening in the US, but it's not mandatory, it's voluntary, and some of the big tech companies like Amazon and Google adhere to it. The idea is to have a QR code on the package of the device; the QR code can be scanned by the user, and the user can see the scores of the device with respect to particular tests, let's say, on cyber security and privacy issues.
Thank you very much for your attention. Happy to take any questions.
(APPLAUSE.)
PETER STEINHAUSER: Very interesting talk. Thank you so much, Anna. Two quick notes. First of all, I was quite scared about your first part of your presentation. The second one is maybe I know that the chip set vendors for gateways are pushing for AI capabilities and they always need something to market it, so I think the data models you created and this detection might be one of the use cases for this technology.
ANNA MARIA MANDALARI: Yes, probably. We cannot know, because once the data is leaving the Smart TV, we don't know what they are doing with the data, but we are carrying out a study in which we are checking which information is served to us on other platforms if we are watching particular content. So, for example, we are implementing an automatic methodology for logging in with the same account on the television and in the browser, and then checking whether, if you are watching sports, you get advertisements about sport, even if you opted out. I don't have the results yet, but hopefully I will have them for the next RIPE meeting.
AUDIENCE SPEAKER: Alistair Woodman. So thank you for the talk. I don't think it's quite as bad as you are articulating at the moment between where the regulation is and the standard setting stuff. I am on weekly calls with both Eclipse Foundation stuff and the OpenSSF; the OpenSource community has been quite active in this particular area. There are still issues with Cen‑Cenelec because they do stuff behind paywalls, but I think they are getting squeezed in that regard. ETSI is actually quite active and being relatively open, so I don't think it's quite as bad as you are articulating. So I would encourage you to come out and join in some of those areas. But it's going to be very important that the appropriate sets of standards are set there; they are not going to have time to create new standards, they will be selecting existing standards and trying to get them ready as a profile set, so there's work involved. But they are not going to have the time to create yet more complexity.
ANNA MARIA MANDALARI: This is the problem. It's right what you say, they are quite active, but they are only responsible for vertical standards, anything that's not directly... directly related with the digital processing to internet of things devices; they will take care of browsers or anything related to the vertical standards of the Cyber Resilience Act. I am starting to work with them because I am going to help, and I think that the standardisation community for the act needs more technical experts. But you are right, I talked to them, they are quite open; on Monday I am going to Brussels for one of the meetings of the ETSI working group, and I hope I will be more involved in the future. But I wanted to hop into the community to discuss this problem, and if you want to help, we need more technical people in standards development; this is what I wanted to say.
AUDIENCE SPEAKER: And I don't disagree with you, but I don't think it helps for people to be turning up at the moment and actually doing stuff, we don't need people running around saying, the world is going to collapse because it isn't; they are going to make things work. The more interesting challenge is whether the whole thing will be a dead letter because we end up having this thing and there isn't any police force paid to enforce any of this stuff. So we have to be very careful about a whole bunch of other types of things, right? So the standards are the least of my personal worries at the moment.
ANNA MARIA MANDALARI: Yes, enforcement is an issue as well. ... thank you.
NIALL O'REILLY: Niall O'Reilly, pacemaker wearer. Essentially I have two and a half or three questions. The first is: when I see complexity like this, I think, can we make a BCP kind of document which will help make it simpler? Alistair mentioned profiles and that's probably the way to go. The other thing is: is it likely to be possible to have a container in my Turris router which will do the gateway stuff you were talking about?
I forget what the third question was, but I know where to find you if I think of it.
ANNA MARIA MANDALARI: I think having a draft on this would be a great starting point, a draft in which we can collect all the issues and the gaps that we have with this... and how we can solve them as an OpenSource community. On the second question: we have OpenSourced part of IoTrim and IoTrim++. You probably have OpenWrt‑based software on your router, if I know you well, and then you can download the software and it will work well; it depends on the router you have and how much RAM you have. We tested IoTrim++ on a very powerful router with 256 megabytes of RAM, but probably that is not one of the ones that you find easily on the market, on Amazon. So yeah, it's there and you can download it. This is the IoTrim part, the one that detects the non‑essential destinations of your devices; in the future we want to OpenSource the part for detecting security issues with the devices, and hopefully we will do that by the next RIPE meeting.
PETER STEINHAUSER: Thank you so much, Anna. And a quick note: OpenWrt already has device fingerprinting, so maybe combining this could be an approach to address this. Yeah. Thank you so much again, Anna. Great talk. Thanks.
(APPLAUSE.)
So, our next speaker is Yuanyuan; I hope I pronounced your name correctly.
YUANYUAN ZHOU: Hello everybody. Today I want to present our work, TwinGuard, which is... My name is Yuanyuan Zhou, and I am currently a PhD student at University College London, supervised by Dr Anna Maria Mandalari. This work is in collaboration with the Global Cyber Alliance and Yokohama National University.
Firstly, I want to give you some motivation for our work: in general, modern IoT challenges demand new defences.
From the figure here we can notice that IoT devices are widely deployed across critical infrastructure domains, and traditional intrusion detection systems struggle with evolving threats. Also, the resource constraints on IoT and edge devices limit the feasibility of heavyweight security solutions.
Another thing is that the limited labelled data in real‑world settings makes supervised detection difficult.
So real‑time, adaptive and explainable intrusion detection is urgently needed.
We did some research on previous work in some areas. The first one is digital twins in cyber security, and what we found is that this concept is widely applied in industrial control system security but rarely to web‑based attacks, and no existing system uses real‑time honeypot data to detect these application‑layer attacks adaptively. We also did some research on in‑the‑wild web attack analysis, and we noticed that the existing analyses are often limited to specific attack categories, and the previous fingerprinting work mostly focuses on the.... and the classification. We do the analysis of intrusions from the wild and give a profiling based on behaviour characteristics and taxonomy validation.
Now I want to give you an introduction to our work. Generally, it starts from the digital twin framework, which mirrors the real attacker behaviour captured by the honeypots, and we use a virtual model that learns and adapts over time.
There are three main components: structured sequence modelling, machine learning classification and semantic profiling. TwinGuard offers a modular, lightweight and extensible design.
To make it more clear, we have this three‑layer design. At the bottom is the physical layer, which is used to capture real‑world HTTP attacks based on the honeypot sensors. Then there is the virtual layer, where we have the storage of the historical data and we train two models: one is the trie tree, which is used for path matching, and the other is a machine learning model, which is used for general intrusion detection. When new traffic arrives, they perform real‑time monitoring, and this triggers a... mechanism which will cause an adaptive retraining mechanism. On the top we have the intelligence layer, which has two components: one is... the other is fingerprinting of infrastructures, of which there are two: the cloud providers and the user agents.
To give you a clearer illustration of each layer: in the physical layer, we have the honeypot network and the data acquisition part, with data lasting 26 days. The primary honeypot network is ProxyPot, which is supported by the Global Cyber Alliance. To test the generalisation to different input, we also have an internal honeypot network, supported by Yokohama National University, and they only have 70% of... language of the primary schema, since in real‑world settings not all the data is formatted in the same way.
The virtual layer is used for real‑time monitoring and adaptive detection. The first part, the monitor, offers an interpretable view of structured request paths by aggregating common behaviour patterns. To be more specific, firstly we have the sensitive word extraction part, where we reduce the granularity of the HTTP request. Then we come up with a structured path representation, which is made up of method, status and URI keywords. If we have a header like this, we get the results from the... to the status and also the keywords from the HTTP headers. From here we start the path matching: if an unknown path is matched, it is added to the unknown flag, and once this flag is over the threshold, we trigger the update mechanism.
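The path matching and unknown‑flag threshold described above can be sketched with a small trie over structured paths. This is an illustrative sketch, not the TwinGuard implementation: the token layout (method, status, URI keywords), the class and function names, and the example requests are assumptions, and the 3% threshold is borrowed from the stable‑period definition given later in the talk.

```python
from typing import Iterable, Tuple

Path = Tuple[str, ...]
_END = object()  # sentinel marking the end of a stored path


class PathTrie:
    """Trie over token sequences such as (method, status, keyword, ...)."""

    def __init__(self) -> None:
        self.root: dict = {}

    def insert(self, path: Path) -> None:
        node = self.root
        for token in path:
            node = node.setdefault(token, {})
        node[_END] = True

    def contains(self, path: Path) -> bool:
        node = self.root
        for token in path:
            if token not in node:
                return False
            node = node[token]
        return _END in node


def unknown_rate(trie: PathTrie, paths: Iterable[Path]) -> float:
    """Fraction of paths in a window that the trie has never seen."""
    paths = list(paths)
    if not paths:
        return 0.0
    return sum(1 for p in paths if not trie.contains(p)) / len(paths)


UPDATE_THRESHOLD = 0.03  # assumed threshold for triggering an update

trie = PathTrie()
for known in [("GET", "200", "login"), ("POST", "404", "admin", "php")]:
    trie.insert(known)

window = [
    ("GET", "200", "login"),
    ("GET", "200", "shell"),
    ("POST", "404", "admin", "php"),
    ("PUT", "403", "config"),
]
rate = unknown_rate(trie, window)
needs_update = rate > UPDATE_THRESHOLD
print(rate, needs_update)
```

Storing whole token sequences keeps shared prefixes (for example all `GET`, `200` requests) in one branch, which is what makes the aggregated view interpretable.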
The other part of the virtual layer is the machine learning classifiers, which are used for general‑purpose intrusion detection. We did feature engineering which incorporates basic HTTP attributes, content embeddings and encoding, and temporal features. For the classifiers, we used random forest, and we also tested with... they generally show similar results, so we just went for the simple one.
Regarding the adaptive part, we adopt a sliding window mechanism which continuously monitors performance degradation and structural novelty within the HTTP traffic stream. We got labelled data from ProxyPot, which has classifications of scan, attempt and intrusion control. We defined the stable periods as those in which both the classifier's accuracy drop is less than 6% and the unknown pattern rate is under 3%. As for the labelling criteria, we applied a labelling mechanism similar to ProxyPot's to our internal honeypot data, which is unlabelled; it is based on the structured request paths, payload content and endpoint semantics. This also shows the usage of the trie tree: if a spike in unknown patterns occurs without existing labels, we can check whether new labelling is needed to maintain the general detection accuracy.
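The stable‑period rule stated above (accuracy drop under 6% and unknown pattern rate under 3%) can be written as a small check over sliding windows. This is a sketch under assumptions: the drop is interpreted as an absolute difference from a baseline accuracy, and the names `WindowStats`, `is_stable` and `windows_needing_retraining`, as well as the numbers, are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WindowStats:
    accuracy: float      # classifier accuracy measured on this window
    unknown_rate: float  # fraction of request paths not matched by the trie


def is_stable(
    baseline_accuracy: float,
    window: WindowStats,
    max_accuracy_drop: float = 0.06,
    max_unknown_rate: float = 0.03,
) -> bool:
    """A window is stable when both criteria from the talk hold."""
    drop = baseline_accuracy - window.accuracy
    return drop < max_accuracy_drop and window.unknown_rate < max_unknown_rate


def windows_needing_retraining(
    baseline_accuracy: float, windows: List[WindowStats]
) -> List[int]:
    """Indices of windows that would trigger the update mechanism."""
    return [i for i, w in enumerate(windows) if not is_stable(baseline_accuracy, w)]


# Hypothetical measurements, not results from the talk.
history = [
    WindowStats(accuracy=0.94, unknown_rate=0.01),  # stable
    WindowStats(accuracy=0.93, unknown_rate=0.05),  # too many unknown paths
    WindowStats(accuracy=0.85, unknown_rate=0.02),  # accuracy dropped too far
    WindowStats(accuracy=0.92, unknown_rate=0.02),  # stable again
]
to_retrain = windows_needing_retraining(0.95, history)
print(to_retrain)
```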
We did some measurement of the accuracy and the unknown rate dynamics, with window sizes from 3 to 6 days. We observe that smaller windows show faster reaction, frequent updates and higher volatility. For larger windows, we observe stable accuracy, fewer updates and a lower unknown rate. From the results here, we notice that a six‑day window strikes a balance between model utility and stable performance.
To measure the adaptive part, we integrated the internal honeypot on 26th March, where we noticed a surge in unknown sequences, and an accuracy drop was also observed upon this integration. But the general performance of the model recovered after one retraining cycle.
Next is the intelligence layer, where we do the intrusion labelling and attacker attribution. Firstly, we have a hierarchical taxonomy structure which has three layers. On the first layer we have the general intrusion categories. On the second layer we have the technique used to realise this kind of intrusion, which illustrates how the intrusion is done.
And on the third layer we have the end goal to help you understand why the attacker is doing it.
We also did some attacker behaviour fingerprinting. The feature distributions are visualised using histograms and kernel density estimation. We did the plotting, and this is for the user agent: from the left part, the X axis represents different HTTP sessions and the Y axis indicates their normalised values across the sessions. And on the left part, in order to measure the deviations between the different distributions, we calculate the JS divergence. From the results, we noticed diverse behaviours across different user groups, especially in intrusion control. We also noticed high divergence between the scanner bots and Python libraries, which indicates distinct attacker behaviour.
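As an illustration of comparing two feature distributions, here is a small sketch that assumes the divergence mentioned above is the Jensen-Shannon divergence. The function names and the histogram inputs are illustrative and are not taken from the speaker's code.

```python
import math


def normalise(counts):
    """Turn histogram counts into a probability distribution."""
    total = sum(counts)
    return [c / total for c in counts]


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p_counts, q_counts):
    """Jensen-Shannon divergence of two histograms over the same bins.

    With log base 2 the result lies between 0 (identical) and 1 (disjoint).
    """
    p, q = normalise(p_counts), normalise(q_counts)
    m = [(a + b) / 2 for a, b in zip(p, q)]  # mixture distribution
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
```

Because the mixture is non-zero wherever either input is non-zero, the computation never divides by zero, which is one reason this divergence is convenient for sparse histograms.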
We did the same thing for the cloud providers, and here we can see from the figure that they generally have similar shapes, which illustrates the overall divergence. From the calculation, we notice that for the cloud it shows slight divergence in the intrusion attack, but the general impact is minimal.
To verify the results from the plotting, we map the hierarchical taxonomy back into each group. Here are the results for the user agent, where we can notice diverse distributions inside different groups. For example, the browser and clean tool sessions are concentrated in broad categories like exploit attempts and web shell uploads, while Python libraries and scanner bots demonstrate greater technique diversity, especially in misconfiguration exploits and file inclusion.
Also for the cloud providers, we verify their similar patterns here, which shows that they have a shared attack focus. They have some minor exploit variations: for example, for cloud C we can notice more misconfiguration exploits, and for cloud D we have something more about... but generally this confirms that cloud based attacks are likely templated and automated, regardless of the provider.
So, to make our conclusion: on the first layer we have these highly accurate and responsive models, which maintain accuracy over 90% during the stable periods.
We established these dual classifiers along with sequence monitoring to ensure the robustness of our model's performance. We also realised adaptive retraining triggered by novelty, where we noticed a strong negative correlation between unknown rate and accuracy. When the new integration comes in, there is a 42% spike in unknowns and also a 30% accuracy drop, but this is mitigated in just one update cycle.
And yes, we did this deployment with diverse traffic, which demonstrates the adaptability across different environments. We did this behaviour intelligence, which reveals diverse attack behaviour across user agent types, and we also noticed that the cloud based traffic shows consistent patterns, which may indicate they have the shared....
For the future work, we are firstly considering more real world deployment and evaluation, like integrating this with the real world traffic from our... and I want to expand the protocol coverage, moving beyond HTTP to include more protocols like SSH, FTP and DNS. This work currently does daily querying from the data source, but in the future it will be able to do continuous streaming, and it is also worth considering lightweight IoT deployment.
That wraps up my presentation. This work is from our lab, based at University College London, and thank you to Global Cyber Alliance and Yokohama University. If you are interested in our work, you can follow us from the following link or scan the QR code here. Thank you for listening, and I am happy to take any questions and comments.
(APPLAUSE.)
PETER STEINHAUSER: Thank you so much.
OK, so we don't have any questions online. Any questions from the audience? Otherwise, a big thanks again, great work, and we open the stage for our next speaker.
YUANYUAN ZHOU: Thank you.
(APPLAUSE.)
ABHISHEK MISHRA: Hi everyone. This work was done jointly with Inria in France and here. So, an interesting story: this work started recently in the RIPE Hackathon on DNS that happened, and there we took on some problem and it went from investigation to hopefully ... over the couple of days. In this work we are looking at the need of having operational and security based best practices for IoT specifically. So why do we need it? We firstly did a state-of-the-art review, in which we looked at multiple standardisation bodies, for example ETSI, ENISA, the European Commission, ISO/IEC, ITU and more. The goal was to check whether there are regulations and guidelines, and do they contain DNS? Yes, DNS is present in multiple of those. But what about DNS in IoT, looking at the intersection? No. So there is a clear gap: there is no IoT specific DNS regulation, framework or standard.
But first of all, why do we need it? We need it because we found multiple issues, and in this talk I will just show you the issues one after another, so here we go.
So, one of the first issues that we saw is in the same test bed as Anna's, which we have at UCL, where we have more than 30 devices across a diverse set of categories, ranging from very naive ones to sophisticated ones. We looked into what happens from the source port randomisation point of view, looking at the standard deviation across devices. As you can see, we find devices which either almost don't randomise or randomise very poorly, so quite unexpected. But yes, we find it. So, for example, a camera, camera 4 here, or a plug fails to randomise adequately. This was one of the first issues that we found.
Another issue. We moved on to investigate transaction IDs in DNS, and looking at the transaction IDs, the issues were more serious. We saw devices which don't randomise their transaction IDs, which could lead to spoofing and other kinds of attacks. We then classified devices into brackets of no or poor randomisation, good, and excellent. So yes, there are indeed devices which comply, but there is a significant chunk of devices, around 25%, which don't. This was another issue that we found, and then we moved further.
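A rough sketch of how observed source ports or transaction IDs could be placed into the brackets mentioned above, using the standard deviation relative to a uniform draw. The function name and the thresholds are assumptions chosen for illustration; they are not the values used in the study.

```python
import math
import statistics


def randomisation_bracket(values, space=65536):
    """Place a device into a bracket from its observed 16-bit values.

    `values` must be non-empty. The thresholds below are illustrative
    assumptions only.
    """
    unique_ratio = len(set(values)) / len(values)
    # A uniform draw over [0, space) has standard deviation space / sqrt(12).
    uniform_sd = space / math.sqrt(12)
    spread = statistics.pstdev(values) / uniform_sd
    if unique_ratio < 0.5 or spread < 0.1:
        return "none or poor"
    if spread < 0.7:
        return "good"
    return "excellent"
```

For source ports, `space` would need to match the ephemeral port range actually available to the device, which is usually smaller than the full 16-bit space. A sequential counter has many unique values but a tiny spread over a short capture, so it still lands in the lowest bracket.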
Next we tried to look at queries, so queries for transaction IDs across devices. A couple of examples here: an LG TV on the left and a RoboVac on the right. We look into the number of queries that they do and the histogram of the number of queries, and what we see is that they are quite distinct. Generally you will see multiple queries, ranging from two to 12, even more, and they are quite distinct, which basically makes them easy targets for fingerprinting. We also needed to see the reasons why the queries are erratic and multiple, so we investigated further. We already saw that queries are repeated, yes, but what do they look like? Looking across the range of devices, we see that queries are repetitive in nature across devices, and some of them query a lot. This is a 30 minute snapshot that was taken of 30 devices.
So perhaps these queries are unnecessary. We will dig deeper, but we saw quite some repetitiveness in the queries within a short duration.
Going further, what we tried to see is the compliance part as well: are private or secure protocols like DoH or DoT supported by IoT devices or not? We tried to lay out a test bed for that. Basically, we configured an Unbound resolver supporting DoH only, and then we tried to see how many devices came on board.
And surprisingly or unsurprisingly, none of them. So we don't see any support for DoH or DoT currently in IoT devices. But then we saw a subsidiary, interesting phenomenon. On the left you have the number of queries per device normally, versus when they are put on the DoH-only setup: we saw a huge amplification, on average a ten-fold amplification of queries, due to the resolution failure. That means that IoT devices are quite aggressive in their querying patterns when there is a resolution failure.
So we move further and look into more issues in DNS for IoT. Here we look at, for example, packet sizes, so the packet size distributions of the queries across devices. One thing very interesting from a fingerprinting point of view is that they are quite diverse: the more the diversity, the higher the chance for a fingerprint to be unique, and some of them are quite robust. We already showed in other papers that this is true. So yes, fingerprinting from the query length is possible. What's the solution? We already saw that DoH is not present, so yes, we need DoH, but we also need padding with it because of that query length. DoH without padding will still cause fingerprinting issues, so we need that.
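To show why padding helps against length fingerprinting, here is a small sketch of block-length padding. RFC 8467 recommends padding queries to a multiple of 128 octets; the device names and query sizes below are made-up illustrative numbers, and the sketch only computes target lengths rather than building the EDNS(0) Padding option itself.

```python
def padded_length(message_len: int, block: int = 128) -> int:
    """Smallest multiple of `block` that is >= message_len."""
    return -(-message_len // block) * block


# Hypothetical query sizes in octets for five devices (illustrative only).
observed = {"tv": 29, "camera": 33, "plug": 41, "vacuum": 57, "hub": 130}

# After padding, most devices become indistinguishable by query length.
padded = {name: padded_length(size) for name, size in observed.items()}
```

In this example five distinct lengths collapse to two, so length alone no longer separates most of the devices.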
We investigated further and looked into why these queries were being done. For example, we looked at the context of the traffic in these IoT devices. Naturally you would think that they are going for an HTTP request following the resolution, but that wasn't the case. What we mostly saw is that these devices basically have a bunch of ICMP pings before the DNS query, and then subsequent... queries were being done. So yes, we are investigating more, but there's not much of a pattern, and they go on querying and querying and querying. Going further, we also looked into support for IPv6, which is quite low: less than 30% of the IoT devices in the test bed supported it.
Moving on from DoH support, which, as I already said, wasn't seen being supported, we then saw another interesting phenomenon: some of the devices were still communicating, and that made us ask why. When we investigated further, we found that some of the IoT devices have fall back addresses; they fall back and keep querying, and that is present even though you have a...assignment, we don't use it being known, you will be going to fall back from certain devices. So that's what we saw, and the goal of all of this is to turn them into recommendations, as I will discuss next.
But looking further into issues still, next we investigated the TTL values in the replies. The TTL values have quite a range. But irrespective of a device having a local cache or not, they don't abide by them. So IoT devices generally don't abide by the TTL values, and they query irrespective of what they receive.
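One way to quantify this behaviour is to count queries for a name that arrive while a previously returned answer would still be valid in a cache. The sketch below is illustrative: the function name is hypothetical, and it assumes that every query is answered immediately with the same TTL.

```python
def early_requeries(query_times, ttl):
    """Count queries for one name sent before the cached answer expired.

    query_times: timestamps in seconds of queries for the same name.
    ttl: TTL in seconds returned in every answer (simplifying assumption).
    """
    early = 0
    expiry = None
    for t in sorted(query_times):
        if expiry is not None and t < expiry:
            # A TTL-respecting cache would have answered this locally.
            early += 1
        else:
            # Cache empty or expired: this query legitimately refreshes it.
            expiry = t + ttl
    return early
```

For example, with a TTL of 300 seconds and queries at 0, 10, 20, 300 and 310 seconds, three of the five queries would count as unnecessary.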
A bit more: we looked into EDNS0 support, and we looked into the mDNS traffic, which is quite large in the devices that we saw and could lead to privacy issues in our passive threat model. So, given those issues, what do we need next? Firstly, I didn't show all the results here; this is a work in progress. These were all passive attacks, where we tried to see the compliance and vulnerabilities, but we also did a bunch of active tests, which I haven't shown here but will be happy to show in future.
In the active tests we did three tasks. First, we looked at how resilient the IoT devices are with respect to malformed replies and injected replies, and we saw some interesting results in their querying and retries, which were quite amplified. Second, we morphed the TTL values that you already saw to extreme values, so for example what happens if you set a zero TTL or an extreme TTL, and we saw very interesting phenomena in queries and retries. Lastly, in the active threat model we also tried to see what happens in the case of denial of service. We basically amplified the responses, not only in terms of their number but also in terms of their sizes, and we saw that devices are indeed affected operationally when you swamp them with numbers and also, interestingly, when you send a crafted reply with very large content in it, which leads to fragmentation and other operational issues. I didn't show those, but we found them.
So we want to convert these multiple, comprehensive issues into guidelines. What do we need? We are in discussions on standards, and we want to start a draft, for example at the IETF. I will be happy to have any suggestions for any working groups since I am not well versed in this regard, so please reach out to me. Thank you, and I am happy to take any questions.
(APPLAUSE.)
AUDIENCE SPEAKER: Hi, Michael Richardson. I just published RFC 9726; it's about DNS and IoT devices, but it's a little more forward looking than, I think, your document, which you want to write perhaps as more remedial. It's about not doing dumb things with your DNS that would make MUD files hard to do. So that's a little bit more forward looking than where we are at, unfortunately.
As to which working group, the IOTOPS working group has been rechartered and your content would be very welcome, and I would be happy to co-author. I had a question: when you did all the surveying, you said there was some DoH, and if I understood you right, what you said is that there were devices that did not use the DNS that DHCP told them to and always used DoH? Or was it only fall back?
ABHISHEK MISHRA: Just fall back.
AUDIENCE SPEAKER: You are saying all the devices tested used the DNS servers you told them to. That's really fantastic, because many people do not want their IoT devices, you know, talking elsewhere, and they also want to be able to do threat detection and signalling by observing DNS requests. So I think in that space we are in a really good place, because they haven't jumped on DoH. And just going to 8.8.8.8: there is some evidence that, for instance, Google Chromecast would keep using 8.8.8.8 even though you blocked it and, you know, gave them local DNS and whatever, and they would keep coming back and trying, and you are like, no, I don't want you to ever do that, and well, you know...
ABHISHEK MISHRA: We saw very few but you are right, it's good in many regards that they don't.
AUDIENCE SPEAKER: Please come to the IOTOPS working group, recently rechartered, and you will be most welcome.
ABHISHEK MISHRA: Thanks a lot and I will be reaching out. Thanks.
AUDIENCE SPEAKER: Hello, thank you for the presentation. I am Sam Cheadle from ICANN, very interested in fingerprinting based on characteristics like DNS query volume or packet size. One thing that wasn't clear to me was whether those were more predictive or more useful features for IoT devices in comparison to conventional network devices. Is that something you have looked into?
ABHISHEK MISHRA: Yes. Looking into them, we saw that queries, in terms of their length and frequency, are quite unique in IoT devices; when you do clustering, you can almost get clear clusters from them. That's the first thing we saw. And the second thing: we were very much interested specifically in the operational point of view, how all these vulnerabilities and all these kinds of attacks, mostly in the active domain, affect operation. In the active attacks we did see that around 35% or so of devices are affected operationally too, so that's something that we were...
SAM CHEADLE: Thank you.
PETER STEINHAUSER: OK, we don't have questions online, thank you so much, great talk. Thank you.
(APPLAUSE.)
Next is Michael. So please go ahead.
MICHAEL RICHARDSON: Hi, this is the clicker, right? Hi, I am Michael, I am talking on two topics, OK, this microphone is really weird, and they are related but they are different topics.
You prefer I use that one? It's weird that it has two pieces sticking out. I don't know where to speak. There we go, all right. Yeah, that's better, let's do it this way.
OK, I am going to talk a little bit about the device identity forum, the IETF SETTLE potential working group, an effort that was called the secure usable internet browser, and the IOTOPS working group, and I would be happy to take as many questions as you have. ... the device identity forum is starting tomorrow, a little bit after lunch. I am sorry for the overlap with our meeting, but there was much negotiation of the time and that is when the inaugural meeting wound up.
And so I am the Chair. What is it? Well, I will talk about the IoT Security Foundation in the next slide. This is an entity that is designed to do, as I said, about 60% technical marketing, OK. We'll be going over this tomorrow, and I'm happy to share my slides if you don't make it there. And the two URLs on there, there's a bunch of press releases there; it's about communication of things and making sure every device has an identity.
Who is the IoT Security Foundation? Founded in 2015 at the annual conference, UK based but worldwide. Dr Mandalari is now on the steering board, since, I don't know, a year ago or something like that. There's a lot of good stuff there. It is an industry forum; it's not too marketing oriented, but it does operate at a level of marketing and involvement in regulation. So if we were comparing it to RIPE and the IETF, they are at a higher and less technical level for sure. They do not intend to create standards, but on the other hand they have done a lot of security best practices and self evaluations and other things like this that are often closer to legal documents and regulations than they are to standards.
So it's an interesting group to be involved with and it seemed like a good place to do this work.
What's this forum about? Device identity is a cryptographically strong identity that's installed in the device. They are ideally installed by the manufacturer in the factory, they are ideally an 802...... but there are other notions. They can go into your secure element, your TPM, your virtual TPM, living in a trusted execution environment or not. As far as I am concerned, if you have a little tiny device that has, you know, a hundred K of RAM or something like this and it's basically sealed, then maybe it's secure enough physically as it is.
You can use them to sign evidence for remote attestation, and, this is where I care about it, you can use them for onboarding devices. Pretty much every single onboarding mechanism, whether it's BRSKI... all of those depend upon the device coming with an identity in it. Over ten years of developing these things, we have received various pushback from PL Mss, oh, it's hard to do, it's impossible, and in at least two cases I have said: your devices already come with them. You just didn't know.
Because no one thought it was important enough to tell you.
It was assumed.
So this is what it's about: making sure we are sharing those success stories with the whole industry, making it clear that it's easier than they thought. And at some level, you know, just creating a good Visio stencil for a certificate. I saw one presentation yesterday, or the day before I guess it was, where they used the traditional diploma icon for a certificate, only it was a trust anchor, not a certificate. Explaining that kind of stuff in a visual way, in a common way across industry, is actually something that I think we are lacking, and it is one of the reasons why it looks really complicated, particularly to non-technical managers. They are like, oh, what is it all about? No, no, it's not that complicated, you just didn't understand it because the presentation was poor.
As I said, it's required for pretty much any onboarding protocol you have to do, and I hope eventually it will make fingerprinting obsolete, because you will have a much stronger, definitive notion of what this thing is.
So what are we going to do? Well, probably a bunch of white papers about success stories, some presentations, some common material, you know, as I said, Visio stencils, SVGs for your presentation, some common things, so that when people go and talk outside of our in-group, facing outwards, the message we are sending is common, and people who see it more than once go: yeah, oh, I remember that, it looks visually similar, the messaging is similar, OK, you must be talking about the same thing.
Right.
And at some point we'll have to do something with COMS and safety and that sort of stuff. Please sign up; even if you don't make it, please sign up and you will be in the database. You can unicast me, that's fine, or hand me your business card or whatever. We are also hoping to have regular, as I said, success story talks and discussions that maybe will be under Chatham House rules, but I don't know quite how to do that with a recorded online thing; we'll figure that out.
So actually I could stop there and ask you if you have any questions, or you can come back at the end as you wish. Anyone racing to the mic? No. Ah, yes, please. Just do it here, that's the back button, here we go.
AUDIENCE SPEAKER: So I worked a bit on device identities across BLE and Wi-Fi and LoRa, and we proposed randomisation techniques and broke them. So when you talk about identities, are we also talking about randomised identities, and how secure are those identities? Are they adaptive enough, or in what context really?
MICHAEL RICHARDSON: So, I am talking in this case about an identity that's put in at the factory that strongly identifies the device: what kind it is, what type it is. So for instance, for some of the regulatory things that Anna Maria Mandalari was talking about, it's impossible to know if the device complies with the regulation if you don't actually know what it is. And I am also concerned that manufacturers will present one device to the regulator, right, that maybe has more RAM, more ROM, blah blah blah, and then will actually fill the supply chain with a cheaper, simpler device that does not do that.
Right. And so it's a bait and switch. I predict it will happen, and device identities are one way to deal with that, so that end users can actually say: what is this device? You said it was going to do this, did it do this? Well, this device doesn't, right. So that's a separate conversation from what the device does if it talks to the cloud or something that will be visible outside of the home or enterprise, and what identities it uses that will be visible over things. That maybe needs to be different. Sometimes we call those operational certificates rather than identity certificates, and if you do TLS 1.3 clients, like mutual authentication, then you can hide that anyway. But there's kind of a back and forth. My opinion is that, outside of keeping the identities of the devices invisible over, say, Wi-Fi to an eavesdropper who is not part of the network, everybody is better served by knowing exactly what it is, that it's your refrigerator talking to the world and not your TV doing ACR. That's better for all of us than trying to obscure those things, having made sure that someone sitting on the street can't tell what you have because you have got some well encrypted Wi-Fi and you are not using, say, MAC addresses that are predictable, which are visible even in encrypted Wi-Fi.
So does that answer your question?
AUDIENCE SPEAKER: It's quite clear, thank you.
MICHAEL RICHARDSON: Anybody else?
NIALL O'REILLY: Somebody told me once there's no such thing as a stupid question, so here's an attempt at a counter-example, because I am totally ignorant.
MICHAEL RICHARDSON: Are you up to it?
NIALL O'REILLY: I don't know, you can tell me, is YANG relevant here?
MICHAEL RICHARDSON: It's not that stupid a question; unfortunately you failed, try again. So the answer is that IoT devices that are manageable by YANG, I don't think you will see in the residential home space ever, OK. But let's say there are air conditioning systems in this hotel; to me it would be reasonable that there would be a YANG module that you could manage them with. Then the answer is that whatever management system reaches out to it is probably going to want security, and the device is going to want to identify itself correctly. So think about a TLS server cert on the device. So the answer is that the device identity, the birth certificate, is probably wrong for that use. But the birth certificate lets you onboard it into your system, which then lets you provide an operational certificate to the device.
So that's how BRSKI, OPC UA, CSA Matter, they all do that mechanism: they onboard the device based on the device identity and they provide a new enrollment certificate to the device so that now it can be, yes, managed over RESTCONF or NETCONF or what have you.
RUDIGER VOLK: Kind of following up on this, I find the idea surprising that you assume that there is no management. If there is management, well, it's a long time ago that strange devices like toasters were SNMP managed. I think there are devices that are doing this right now, and if there are devices that are managed by SNMP, I would think YANG, OK, that kind of, they would have...
MICHAEL RICHARDSON: You are completely correct in that, and that's why I caveated it: in an enterprise or hotel or something, I would totally expect that. The question is not whether YANG is useful in the home for the device; the question is whether the home has a manager that speaks YANG out there, OK, and the trend is no. So the CSA Matter mechanism has its own private information model that they use to manage devices, and it's not YANG.
OK. So alas it's their own thing, go talk to Google, blah blah blah. And OPC UA, for instance, is an industrial protocol and they have their own information model, and it goes back to, well, 1995 with CORBA, if you remember that space. Yes, that's how I feel about that, but they are at least using TLS now and not their own RSA mechanism.
OK, I am going to go on. IETF related things: there is a proposal to create a working group called SETTLE, securing access to TLS local resources. It's got a proposed charter, it probably won't have a BoF, and it might be live by Madrid. It's the result of several people kind of reinventing or rediscovering the problem of how do I speak HTTPS to a device in my home: it doesn't have a public name, it has an RFC 1918 address, and my browser puts up ridiculous, crazy warnings, because if it did have a certificate, it would be wrong.
So the question is either how do we get certificates into such devices with trust anchors, or how do we use something that's not TLS, or how do we make browsers that maybe have more interesting properties, and which one it is is still open to debate. We had presentations maybe two years ago at RIPE, at this thing, about the secure usable internet browser, a project which came out of the IoTSF, and we have been back and forth. There isn't a real answer at this point; there is a conversation and a tussle, and I am inviting you to participate in that. That's what I'm trying to update you about.
As part of that, you should know the IOTOPS working group is rechartered. It can do more protocol work, and in particular device lifecycle: discovery of software updates, which is not in the mandate of the SUIT working group, and the software update might not be in the SUIT format either. You might prefer to have them all on a server in your network, so that all 12 of your light bulbs, or 12,000 in this hotel, don't need to reach out to the internet to get their own copy of the firmware, for instance. And at the end of their life, you might want to kill them and remove all the credentials on them, and be assured that it happens before you put them in the dumpster, because someone may retrieve the device and discover something you didn't want them to.
And it is now the official home for MUD work, so it's moved from OPSAWG, which was a problem, and we have an RFC 722 bis and it's much expanded, and your comments will be welcome.
SPEAKER: Any questions to Michael? Otherwise thanks a lot again for a very entertaining and informative talk, thanks Michael.
(APPLAUSE.)
All right, this brings us to the end of the working group session for today. So as you may know the term of my co‑Chair, Peter Wehrle, expired today and I want to thank him so much for stepping up, two years ago we had a situation that we didn't have a candidate and he was spontaneously jumping in and I am really grateful that he served the working group for those two years. So a big thanks to Peter, he is online. So thanks Peter.
(APPLAUSE.)
So during the process of finding a new co‑Chair on the mailing list, Professor Anna Maria Mandalari volunteered, and the members of the working group, yeah, they gave her very strong support. So by the charter of the working group, I am very happy to announce that Professor Anna Maria Mandalari has been selected as the new co‑Chair of the IoT Working Group. Thank you so much for volunteering and for your openness to serve the working group. Thank you so much.
(APPLAUSE.)
And then like unfortunately there's one more thing. So I was asked to just remind you the NomCom will hold office hours during the second half of all lunch breaks, from 1.15 to 2pm, and they are in the room... to speak in person to the committee to arrange for several meetings with committee members and you can also share feedback on ICP2 from 1.30 to 2:00 at the Meet and Greet desk.
And with this, I want to thank all of the attendees today in person and online for the session and the big big thanks to our presenters today, it was a really informative and great session and I am grateful for everyone who attended and helped. Thank you so much.
(APPLAUSE.)
(Lunch break)