Thursday, 15th May 2025.
Side room. 9:00 am
Connect working group
WILL VAN GULIK: Hello my friends, welcome this morning, this early morning, to the session of your second favourite working group, because your first one is obviously NCC Services. If you are here for IPv6, that's not the right room, unfortunately, but if you are here for Connect, by all means, welcome. I am here with my two amazing co‑chairs, Stavros and Paul, and we will take you for a ride in the next hour and a half. We have a less packed agenda than usual, but we have got amazing topics. And with that, I was going to ask if anyone had comments on the RIPE 89 minutes, and I am going to look at the room and see if anyone has something to say? No, you can't have comments, because we didn't actually upload them.
We had a miscommunication, so we will fix that, and I apologise for that part. With that, we can start our session, and I will ask Leo to the stage, who is going to give us a PeeringDB update.
The floor is yours.
LEO VEGODA: Good morning everyone. So, I have some news from PeeringDB: I'll talk a little bit about our recent elections, feature highlights from last year and what we have coming up. I will keep it relatively brief so there's time for a question if anyone is interested. So we just had an election: Izzy and Alex are newly elected to the board and Livio was re‑elected. I don't think the board has met to decide who is going to be Chair yet, but that will happen... oh, there's an animation! There you go.
That will happen very soon.
So, we have been working on search and data quality, particularly in 2024. We are not taking away anonymous search, but I do want you to know that if you log in, you are going to have to use multi‑factor authentication from July. Some of the features that we have implemented recently: you can choose how your results are displayed, and you can get the API query for your web query just by clicking a button. Select the format you want for your query, like cURL or Python or whatever, and you put that in your copy buffer and use it wherever else you want. At the moment this is only for simple search; if people wanted this for advanced search, please let me know, there's no reason we couldn't implement it for advanced search, but we haven't done that yet.
And this is all part of reimplementing search with the new V2 code; the old V1 search will be going away, and we have been doing significant work to make sure that the results you get are the results that you would hope to get.
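As a rough sketch of what the "copy API query" button produces: a simple search on the website corresponds to a query against the public PeeringDB REST API. The parameter mapping below is illustrative, not PeeringDB's actual implementation.

```python
# Sketch: turning a PeeringDB simple-search term into the equivalent
# REST API query, similar to what the "copy API query" button gives you.
# The endpoint is the public PeeringDB API; the exact mapping from a web
# search to filter parameters is an assumption for illustration.
from urllib.parse import urlencode

API_BASE = "https://www.peeringdb.com/api"

def simple_search_url(object_type: str, term: str) -> str:
    """Build an API URL that searches one object type by name."""
    params = urlencode({"name__contains": term})
    return f"{API_BASE}/{object_type}?{params}"

print(simple_search_url("net", "example"))
```

The API accepts Django‑style filter suffixes such as `__contains`, so the same URL works from cURL, Python, or whatever client you paste it into.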
We have an upcoming web UI change. Around the start of last year, we tried to do this and we clearly didn't have the right process in place. So we put the new design on a preview site, we didn't get the feedback that we were hoping for so we have now as of yesterday started rolling the new web UI out on beta to PeeringDB volunteers. When we are happy that everybody who is on the inside of PeeringDB is happy with it, we will then push it out to 20% of users, randomly selected. And those 20% of users will have a switch in their profile where they can go and turn it off, go back to the old interface.
When we see that people are relatively happy with the new design, we will then push it out to everyone.
So we have improved our process based on user feedback. We are currently doing some design work for the new planned status. This would allow you to go and say, I will be in this facility in July, and then maybe go and say, actually it was going to be July and now it's going to be August. People would be able to do some planning based on what you are planning, and there would be design elements in there so that you can go and look at what is real right now, what is planned, and the combination of both.
And if you have any ideas for features, for instance something like this, please come and let us know; this is a feature that was suggested by a user, and we liked it and we are doing it. As I mentioned, mandatory two‑factor authentication starts in July, with lots of choices. If you currently use the API and you are relying on basic authentication, that's going away; you will need to use API keys. The multi‑factor authentication account recovery uses email, so it's as reliable as your email, and you can use passkeys, time‑based one‑time passwords, or U2F hardware tokens. We have redesigned the management interface to make it nice and easy to manage, so please go make sure you are using it, so that when you go and log in in July, it all works.
We are doing a comparison tool at the moment; this is literally being coded right now. You will be able to go and put in one internet exchange point or one facility, compare it with another, go and see who is there, and do the analysis quickly using our website. Again, this is a user‑suggested feature; if you have suggestions for features, please let us know. And all of this is possible because we are sponsored: yesterday we published the Form 990, which is the non‑profit financial disclosure, and that's on our website, you can go and have a look at it. But thank you to our sponsors, we couldn't do this without your sponsorship.
STAVROS KONSTANTARAS: Thank you, Leo, very much, let's see if there's any questions for Leo from the audience. Seems like no, thank you very much, Leo.
(APPLAUSE.)
And now I would like to invite my good friend Marta. Marta has a very nice topic to present to us, about building a secure testing environment for an IXP.
Marta, the floor is yours.
MARTA BUROCCHI: Thank you, Stavros.
Good morning everyone, I am Marta and today I would like to share with you how we build our secure test environment for our IXP.
Do you know when you make a small change in your configuration in production and everything breaks? This is the reaction of a network engineer; please note that this is a real picture from Namex. The idea comes from a very practical need: we wanted to create a secure test environment, isolated but as close as possible to the real network, where first our members could test their configuration without affecting the real LAN, and also we could use this space to test our services or make changes to configurations before deploying them to the production network.
So that was the idea. And I have these old friends from university, and they had developed a powerful framework, which is Kathara. It's a network emulator, and I was pretty sure it was the right tool to bring our idea to life: to build a representation of the Namex peering LAN, fully dedicated to testing. During this presentation, we will explore what the digital twin is and how it works.
Before starting with the serious discussion, I would like to share with you a fun fact: when I was preparing this presentation, I discovered that if you search for Kathara on Google and switch to images, you will find it's a beautiful Greek beach in Rhodes. I hope it is true, because we have a few Greeks in the room, and as summer is approaching it would be nice to talk about this, but unfortunately we are referring to this one: Kathara, a lightweight, container‑based, open‑source network emulator, so don't be confused. It comes with its own programming interface, and I will share my time with Lorenzo here to talk more about this framework.
What is the Namex digital twin? As the name suggests, the digital twin is indeed a twin of our peering LAN, the Namex peering LAN, reflecting also its services. What I mean is that this built environment has exactly the same members connected, the same route servers running the same daemons speaking BGP, like BIRD, the same configurations, the same IP addresses. How did we do it? By using Kathara, of course: Kathara is the network emulator, and for each route server instance we have a Kathara device running.
The whole set‑up is provisioned on a pre‑production environment, which is just the quarantine VLAN. Let's look at the architecture. The core of the architecture is the digital twin manager. Its job starts by creating the digital environment for the first time, taking some configuration files as input, and then it keeps this environment synchronised with the real network. It talks with Kathara, which is the network emulator as I said, and Kathara relies on a container engine; we use Docker, as you can see, but for large‑scale scenarios we can replace it with Kubernetes. The result is the Namex digital twin, fully operational and dedicated to testing.
To make this environment realistic, we synchronise it with the real network. Let's see what kind of information we need: the list of the active peers already connected on the peering LAN, the dump of the actual route servers' RIBs, both IPv4 and IPv6, and the route server configuration. The digital twin manager takes this information and creates and removes peers in accordance with the live network; it uses the dump of the RIBs to create a BGP configuration for the peers, and it reloads the route server configuration. The final output is a fully operational copy of the Namex peering LAN on a test environment, so we have a Kathara device running for each peer and a Kathara device for each route server instance, and the route servers are running exactly the same configuration as in the real network, so they are re‑announcing the same prefixes as well.
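The synchronisation step described here can be sketched as a simple diff between the live peer list and the currently emulated devices. The function and all names below are hypothetical, not the actual digital twin manager code, which drives Kathara's Python API.

```python
# Sketch of the digital twin manager's sync step: compare the set of
# active peers on the real peering LAN with the Kathara devices currently
# running, and decide which emulated peers to create or remove.
# Everything here is illustrative, not the real Namex implementation.

def sync_plan(live_peers: set, emulated: set) -> tuple:
    """Return (peers to create, peers to remove) to match the live LAN."""
    to_create = live_peers - emulated   # on the LAN but not yet emulated
    to_remove = emulated - live_peers   # emulated but no longer on the LAN
    return to_create, to_remove

live = {"AS64496", "AS64497", "AS64499"}
twin = {"AS64496", "AS64498"}
create, remove = sync_plan(live, twin)
print(create, remove)
```

Running this plan periodically, plus reloading the route server configuration from the live dump, keeps the twin converged with the production LAN.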
So what we have created is an isolated and secure environment where our members can connect and test and fix their configuration without affecting the real network. New members can also use this digital twin to test the route server prefix filtering, for example, so they can know in advance if their objects are valid and their RPKI is OK, and we can also use this safe place to test our new services or make changes to the route server configuration.
So the trick is that as it is faithfully reflecting the peering LAN, we can really test how a change may behave in the real network.
So since we have these environment ready, we thought we could use this sandbox as a starting point to build more tools and use cases on top of it.
Actually, the idea came last year, when we saw the IX.br presentation; they had proposed a portal for new members, and through this portal members could self‑check their configuration compliance. So we thought we could easily integrate the quarantine dashboard, and we will see what that is. But it's just one of many applications and use cases we can integrate on this digital twin, so it's just a starting point. Indeed, we can think about RPKI validation, emulating also an RPKI validator, and we can emulate ASPA testing or integrate Rose‑T, which checks MANRS compliance.
So what is the quarantine dashboard? It's just a portal we provide to our new members, and what they have to do is just fill out a form with some simple information, and they can self‑check if their configuration is compliant with our technical rules.
We have two more components in the architecture: the quarantine back‑end, which is the component running the tests behind the scenes, with Kathara and the digital twin behind it, and the quarantine front‑end, which is a user‑friendly interface provided to members.
So the checks we perform are aligned with the BCOPs and with our technical rules, of course, and more or less the same as the... are running as well. I would like to point out that some of these checks can also be run without the digital twin behind them; the BGP checks, for example, rely on it, because there we can really test the BGP behaviour. Before looking into the checks in detail, please note that these checks confirm that the configuration is correct at the very first moment, when members connect for the first time. Of course, the configuration may change, so unwanted traffic needs continuous monitoring.
So first we run some connectivity tests, which check the reachability of the peer, and we check that proxy ARP is disabled; we want to be sure it's just the traffic intended for... Then we switch to BGP checks: we first check if the BGP session between the peer and our route server is established, otherwise we cannot go ahead, and then we check the prefix limit. This is something that we have been discussing lately at the route server workshop among IXPs, so it could be a way to double‑check in advance if the value we chose is correct.
And also we validate the next hop: we check that the next hop of the incoming prefix is the same as the peer's address. And for AS path consistency, we check that the leftmost AS in the AS path is the same as the peer we are peering with.
We monitor the traffic for one minute and detect if they are sending some unauthorised traffic types, such as neighbour discovery protocols or internal routing protocols. The last one is about security: we conduct this test to detect if there are open ports for DNS and NTP and so on.
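A minimal sketch of the next‑hop and AS‑path checks described above. The data structures and names are illustrative; the real quarantine back‑end drives route servers inside Kathara rather than inspecting dicts.

```python
# Sketch of two of the quarantine BGP checks: the next hop of a received
# route must be the peer's own address, and the leftmost ASN in the AS
# path must be the peer's ASN. Illustrative only, not the Namex code.

def check_next_hop(route: dict, peer_ip: str) -> bool:
    """Next hop of the announced prefix must equal the peer's address."""
    return route["next_hop"] == peer_ip

def check_as_path(route: dict, peer_asn: int) -> bool:
    """Leftmost AS in the path must be the peer we are peering with."""
    return bool(route["as_path"]) and route["as_path"][0] == peer_asn

route = {"prefix": "192.0.2.0/24", "next_hop": "198.51.100.10",
         "as_path": [64496, 64511]}
print(check_next_hop(route, "198.51.100.10"))  # True
print(check_as_path(route, 64496))             # True
```

A route failing either check would be flagged on the quarantine dashboard rather than silently accepted.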
So some numbers about scalability: currently we are emulating 220 members and four route server instances, like in the real Namex peering LAN, with this hardware specification. As I said, the idea is that the digital twin is designed to scale according to the IXP's growth, so one consideration may be that for large‑scale IXPs it may be resource‑intensive to reproduce the whole network on a test environment. But as I said at the beginning, Kathara already has its own distributed solution, which is Kubernetes.
To conclude, we have a secure, isolated and realistic test environment. Our new members can connect and safely test configurations without affecting the real LAN. And we can also use this place to test our services beforehand. The current dashboard is one of many applications we can provide, and we will integrate more tools and use cases for our members.
That was our experience and if you want to know how you can build your own digital twin, you can contact me or the developers at this address.
I want to conclude by saying that v6, which is another Italian IXP, is building a digital twin as well, so don't miss your chance. Thank you for your attention, and I am going to share the stage with Lorenzo, who is one of the Kathara maintainers, to talk more about this framework. Thank you.
(APPLAUSE.)
LORENZO: So, hello to everyone, I am Lorenzo and I am one of the main maintainers of Kathara, together with Tommaso down there. So let's first start: what is Kathara? I hope that everyone here already knows Kathara; we have presented it at many other RIPE meetings and other networking meetings all over the world. It is a container‑based network emulator and it is an open‑source project. It's very, very good and it is widely adopted in teaching: many, many universities around the world use Kathara, many of them to teach networking, but universities also use Kathara to teach cyber security, just to have an enclosed space, a sandbox.
So, today we presented the Namex digital twin, and the first question is: why did we choose Kathara for the Namex digital twin? First of all, because we needed minimal resource usage; many IXPs are really, really big, even bigger than Namex, and with Kathara we can start up many more devices on a single physical server. Then, because we have the Python APIs: through the Python APIs, we could take the real Namex configuration and configure the network automatically, and update the network automatically without needing to restart the whole scenario. And finally, because it is scalable, so just like the slide... we can emulate a lot of devices. The scalability is important; if we reach... we will need to go to a Kubernetes cluster.
Who are the main people around Kathara? The first one is me, Lorenzo; we have Tommaso, and Mariano, who is listening from home, I hope that down there everything is OK. Here we have all the main contacts for Kathara and a QR code for our website, if you want to learn more or view the list of universities that are using Kathara, and if you want to ask any questions, please use the email address provided to write to us.
I am here today because lately we founded the Kathara Development Association. Because we work on Kathara in our free time, we want to create something more scalable; also, we want to create something that can expand Kathara, so that we can have more people working on it. So we founded a non‑profit organisation in Italy. The main goal of this organisation is to build high‑quality tools for researchers and network developers. Kathara, of course, is the main system, but we also have the digital twin, we also have Rose‑T for MANRS compliance, and we have other tools used for teaching.
And also for teaching, we provide high‑quality computer networks, education material, we provide the slides, we provide laboratories. I mean, there are some courses around the world that just download our material and they present on the class.
And finally, we want to open‑source projects; I mean, all the projects that are made with Kathara, like the digital twin, we want to make open source. We want to share them back with the community, and we hope that many other projects will come.
So, I hope that you liked Kathara, and I hope that many of you would like to collaborate with us, maybe proposing new projects, maybe funding us, why not, and we'd love to hear from you. If you have any questions for me or Marta, we are here.
(APPLAUSE.)
STAVROS KONSTANTARAS: Thank you, any questions? Yes.
AUDIENCE SPEAKER: Not a question, but thank you very much to Namex for implementing this on the real network with real IPs, not something which runs on RFC .. space for testing environments; other competitors still do this, and I really appreciate it.
MARTA BUROCCHI: Thank you.
AUDIENCE SPEAKER: Hi, Chris from BCIX, Berlin. What I have not understood about the Kathara implementation: is it that Kathara is providing the connected networks to the IXP and the IXP itself is running on hardware? Or is Kathara also simulating what is done on the IXP's hardware?
MARTA BUROCCHI: Kathara is a network emulator, so it is emulating the real network. What it takes is just the real set‑up, the real configuration files, but everything is emulated, just containers running as Kathara devices actually, and so it is just reflecting the real LAN on our servers. So everything is....
Thank you for the question.
AUDIENCE SPEAKER: So you don't need images, special images, from the hardware devices of the routers?
MARTA BUROCCHI: You mean layer 2? Yeah, actually it's not in the scope of the test; we are testing layer 3 and BGP configuration. But we can extend it, because so far we didn't have images ‑‑ well, the developers didn't have images provided. But yeah, we can extend it, and so far at layer 2 there is just a simple switch connecting the Kathara devices.
AUDIENCE SPEAKER: OK, thank you.
MARTA BUROCCHI: You are welcome.
STAVROS KONSTANTARAS: Any from remote participants, any questions? Nobody. Then thank you both for an excellent presentation.
(APPLAUSE.)
And now I'd like to invite Greg; he will tell us about a topic that actually I am very interested in, an EVPN migration at an IXP. Greg, the floor is yours.
GREG HANKINS: Good morning, I am Greg from Nokia. I am presenting on behalf of my co‑author, who couldn't be here today, but this is a joint presentation, written somewhat from the perspective of Telehouse America. So who are we? This is not me, this is Telehouse America, established in 1987, and they have been running NYIIX. They are owned by the parent company KDDI in Japan, which is a very large mobile operator and has many subsidiaries and data centres and things like that. Specifically about NYIIX: it's a very large exchange, the network is not terribly complex but there are several sites, there are over 245 members, at least two Tbps of traffic and hundreds of thousands of routes on the route servers, and RPKI is supported, so that's nice to hear.
Issues with the previous infrastructure: it was based on an outdated VPLS design. When it was designed, VPLS was the hot new thing and the latest technology, but that was many years ago, and we have seen a lot of IXPs, many here in the room, that have deployed VPLS and have now migrated to EVPN. This is a trend; I have been talking about EVPN for over ten years, and it's good to see people finally migrating to this technology. The hardware that was deployed was somewhat limited and could only support 100 gig ports, no 400 or 800 gig. And NYIIX in particular had a long history of issues with broadcast traffic, so that was a real issue for them, as well as some ECMP issues where traffic didn't balance correctly.
So what were they going to do to resolve this? Support 400 gig ports with flexible configuration: as many of you know, with QSFP‑DD you can support 100 gig, 400 gig and 800 gig in the same form factor, and that's a really nice growth model for customers and backbone links to go to higher‑speed interfaces. They also wanted reliable software and strong technical support, and short lead times: as the industry has faced many semiconductor challenges, I am sure you have run into long lead times, waiting months and months for equipment, so they needed something within a reasonable timeframe. And also adopting EVPN to manage the broadcast, unknown unicast and multicast traffic, and segment routing to simplify the configuration and operation. They chose Nokia, obviously.
And a little bit about the new platform: it's called Astron, the platform name. They are deploying 400 gig ports for future‑proofing, just on the backbone for now, but as many IXPs, I am sure you also have demand for 400 gig customer ports, if not now, I am sure it's coming as 400 gig adoption gets more popular. Obviously EVPN for efficient MAC learning and traffic management; segment routing as the underlay with EVPN as the overlay provides a nice separation of protocols and services, and it's just really nice to separate it that way. And then, this is specific to the Nokia product: they are using the new CLI based on YANG models, which provides the same view in the CLI and over NETCONF, for example, and they really liked the automation capability of the CLI. The migration was completed at the end of March, and everyone was migrated successfully and with minimal downtime.
So just a little bit about EVPN; we wanted this to be more about the migration. There are lots of presentations by IXPs in recent history: IX.br have been talking about their migration a lot, and others have given presentations about their migration to EVPN, but the focus of those presentations was on the technology and benefits and less on the migration. We tried to make this a little bit different: enough background so you know what we are talking about, and more focus on the migration, which turns out to be surprisingly simple.
EVPN is layer 2; the technology is over ten years old now, it's rich and mature and stable, full‑featured by now, supported by many vendors. It's a great technology, and it has a lot of benefits, in particular in how it manages MAC addresses and flooding. It also supports active‑active multihoming and integrated routing, so it's actually a layer 2 and layer 3 VPN technology, because you can provide multiple services over EVPN.
Here's just a quick picture of how proxy ARP works over EVPN. The top one in purple is normal flooding: an ARP request or ND request is just flooded throughout the network, and whoever has that MAC answers and replies back. With EVPN, the blue part on the bottom, the flooding doesn't actually take place at layer 2; it's encapsulated and then forwarded by each router along the path to find the MAC address. And you can optimise it a little bit more by configuring static MACs in the routers, so the flooding doesn't take place at all and the leaf router proxies and answers that request, instead of the flooding going through the network.
So it's a great way to, first of all, just reduce the broadcast traffic, but also really lock down the flooding, so that your routers at the IXP do the flooding and not the customer routers.
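The proxy‑ARP behaviour in that picture can be sketched as a toy model: the leaf answers ARP requests from MAC/IP bindings it learned via EVPN instead of flooding them across the fabric. All names here are illustrative, not any vendor's implementation.

```python
# Toy model of EVPN proxy ARP at an IXP leaf: MAC/IP bindings learned
# from EVPN (type-2) routes let the leaf answer an ARP request locally,
# so the request is never flooded to the customer-facing ports.

class Leaf:
    def __init__(self):
        self.arp_table = {}  # IP -> MAC, populated from EVPN routes

    def learn(self, ip: str, mac: str) -> None:
        self.arp_table[ip] = mac

    def handle_arp_request(self, target_ip: str):
        """Return (reply_mac, flooded?) for an incoming ARP request."""
        mac = self.arp_table.get(target_ip)
        if mac is not None:
            return mac, False   # proxy reply, no flooding needed
        return None, True       # unknown target: fall back to flooding

leaf = Leaf()
leaf.learn("203.0.113.1", "aa:bb:cc:dd:ee:01")
print(leaf.handle_arp_request("203.0.113.1"))
```

The BUM‑traffic reduction measured after the migration is exactly this effect: known bindings are answered at the edge, and only genuinely unknown targets still trigger flooding.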
A little bit about segment routing over MPLS. It's also not a new technology, it's been around for ten years; EVPN and segment routing have been a big focus of Nokia development, and we have tried to move those standards along. It encodes paths as a sequence of segment identifiers, called SIDs, and those are just distributed with the IGP, with OSPF and IS‑IS both supported; you completely remove the signalling protocols and you just use the IGP to distribute your label routing information.
Here's how the forwarding works. Basically, the ingress router just sticks the label of the egress router on the top of the stack, and then that's just forwarded along through the network. So in this case the egress node is 600: the ingress puts 600 on the packet, it goes through the MPLS network, the labels are popped off, and out comes the packet.
It's real simple, and again this is just using OSPF as the IGP to distribute that routing information.
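The label operations in that example can be sketched as a toy model (not a router implementation): the ingress pushes the egress node SID, 600, and the label is popped before the packet reaches the egress.

```python
# Toy model of SR-MPLS forwarding as described: the ingress pushes the
# node SID of the egress router onto the label stack, transit routers
# forward on that label, and it is popped (penultimate-hop pop here)
# so the egress receives a plain packet. Illustrative only.

def ingress(packet: dict, egress_sid: int) -> dict:
    """Ingress: push the node SID of the egress router onto the stack."""
    packet["labels"] = [egress_sid]
    return packet

def penultimate_hop(packet: dict) -> dict:
    """Pop the label so the egress node gets an unlabelled packet."""
    packet["labels"].pop()
    return packet

pkt = ingress({"payload": "data", "labels": []}, 600)
print(pkt["labels"])   # the stack while transiting the MPLS network
pkt = penultimate_hop(pkt)
print(pkt["labels"])   # empty: plain packet delivered to the egress
```

Since the SIDs are distributed by the IGP itself, no LDP or RSVP‑TE signalling is needed to build this forwarding state.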
So on to the testing and migration; these are the slides that I hope are new and a little bit more interesting to you. ContainerLab is a simulation infrastructure that was used extensively for testing. It's really great; it supports probably every routing OS that has a container image, certainly Nokia, Arista, Juniper, Cisco, BIRD, FRR, and the list is endless. It allows you to load vendor simulators and then build a simulation in containers, as the name says. For testing we ran Nokia SR OS containers; the existing network was Extreme, and Extreme didn't have a virtual OS image, so we converted the configuration to Nokia SR OS format and used that for the migration. The idea was to create a simulation environment as close to the production environment as possible. They initially considered using MPLS‑TE, but that didn't work because the IXR platform is a Broadcom‑based platform, and I think it wasn't supported on that chipset; but it turned out that SR‑MPLS worked on all the platforms that NYIIX was considering, so they chose SR‑MPLS.
And the migration is surprisingly complex and simple at the same time. The complex part really is building the parallel network: obviously you have to configure the parallel network correctly with all the routing protocols and user access protocols and things like that. But once you have the parallel network operational, all we did was configure LDP sessions between the old Extreme network and the new Nokia network, and that provided the interconnection.
So the migration really was pretty simple: pre‑migration, configure the network, add the router to ... create the LDP tunnels, and basically prepare to migrate. And if you do the pre‑cabling in advance, so you have the new cable right at the customer router, the migration step really is just a couple of seconds to pull out the old cable and stick in the new cable, and there you go, you have been migrated. So that was really the extent of the migration, and the surprisingly simple part after a bunch of complex preparation.
Obviously.
So yeah, as the point is that downtime is limited to the duration of the cable swap, probably a couple of seconds.
So, two main issues. There was, I guess, a misconfiguration, or an issue that they didn't encounter during testing: there was a loop, and they figured out that they needed to configure split horizon. And there was an odd issue we couldn't replicate, actually: for some reason, on the Nokia devices the MAC address was stuck and wasn't learned correctly when the cable was swapped. The solution was just to manually clear the MAC on the new Nokia device, and that solved the problem. It wasn't consistent and we couldn't replicate it; one of those things that just happen and that you deal with.
The benefits are basically what I said at the beginning: stability, scalability, 400 gig capable, 800 gig capable in the future, and the real benefit is the centralised MAC learning, with EVPN as the overlay and SR‑MPLS as the underlay. And just a statistic here: they measured the traffic, and immediately after the migration there was already a pretty significant reduction in BUM traffic, so it's very nice to see that.
And just a couple of next steps. The SR‑MPLS configuration right now is basic, and there are a lot of additional things that NYIIX is considering: possibly using weighted ECMP, and PCEP, the path computation element communication protocol, could be used to optimise traffic flows as well. There's also an underlying IGP algorithm feature called FlexAlgo for traffic engineering. Depending on their traffic engineering and steering needs, they have a lot of options with SR‑MPLS for future flexibility, whatever comes up. I think one of the things that they really liked about it was the flexibility in things they could do in the future, that they don't necessarily have to do now.
And that was pretty much it.
(APPLAUSE.)
WILL VAN GULIK: Thank you very much for that. Do we have any questions in the room? Online, do we have any questions? People are silent, that was like so clear, so simple.
GREG HANKINS: So simple!
WILL VAN GULIK: You should go and see... and you didn't want to share them here online.
So thank you very much for that.
GREG HANKINS: Appreciate it.
(APPLAUSE.)
WILL VAN GULIK: So now we have a talk from Stavros and, sorry, I can't remember the name, Matthias, excuse me, a comprehensive analysis about what's happening in our IRR world. And with that, the floor is yours.
STAVROS KONSTANTARAS: Thank you, Will. Good morning everyone. For this presentation I invited my good friend Matthias from DE‑CIX, and we are going to do it together. We would like to give you a kind of report, the result of a comprehensive analysis of the IRR landscape. As you can see, it's work combined by AMS‑IX and DE‑CIX, and some other colleagues from DE‑CIX have contributed significantly as well. So let's start with that.
The reason for this work is that it was inspired by a BCP proposal that was introduced back at RIPE 86 in Rotterdam, where the BCP stated that IXP operators must use only the five RIR‑operated IRR databases in order to configure their route server filters, also including the official delegations of these databases. And OK, if you are an IXP, maybe small to medium size, and you are already using those, it's fine; you don't have any impact because of this BCP. For bigger and larger IXPs, that might create some significant changes in the pipeline, in the daily operations, where actually customers need to contribute by transferring objects from third‑party databases to the official databases, and this huge, massive transfer can have a huge impact on operations. A grace period of 12 months was introduced in order to allow a smooth transition for those RPSL objects. This list was enhanced by some... databases that cover 1% of the global route objects each, and these were RADB, RIPE N...
Then, after one year and two RIPE meetings, there was a call and a discussion on the Connect working group mailing list. There was some community reaction because of this, and people started coming to the mailing list and expressing their concerns. Some of them asked if the... represent operators better than third‑party IRRs; there was another concern of, OK, is the juice worth the squeeze, because, yeah, RPKI is taking over? And of course the biggest question, the most complex one, was: what about the legacy space, and whether it can be transferred. We are ready with this presentation to answer those questions, and that's what we are going to do. In order to do that, we would like to explain the method, and I am handing over the mic to Matthias.
MATTHIAS: Hi everyone. Those are very fundamental questions, and it escalated into a major data analysis; you can already see that from the amount of sources that we scraped and correlated. Essentially, we tried to find every RPSL object on the globe, downloaded it and analysed it. We used a lot of BGP data and RIR‑specific sources on legacy data; if someone from the RIRs is in the room, this is a total nightmare, so please work on that and give us a real dump, it's really hard to process and find this in the delegation files. And we did the same for the data sources of the IRRs and also for legacy space.
Let's start with the first question: do authoritative IRRs, and by that we mean the RIR-operated IRRs, represent operators better than third party IRRs? For that, you first have to look at where the objects are stored. On the X axis you see the different databases we analysed and their share of the global objects. On the right you see the authoritative ones, and on the left the third party IRRs, and if you do the math and split the objects to the left and to the right, you see that route objects are stored roughly equally in both types of databases; for AS sets there is already a slight preference, and aut-num objects are predominantly stored in authoritative IRRs. Many of the third party IRRs hold much less than 1% of the global objects, some even less than ten objects in total, so those are really interesting candidates.
And yeah, on the left side, if you look at TC, that one sticks out in several ways; you will see it a couple of times in the following talk. It's mainly relevant in the LACNIC region, in Brazil. They are doing an exceptionally good job in the third party landscape, but they are also very small; for AS sets, though, they are storing the main part in the LACNIC region. A very interesting finding.
So, do networks use multiple IRR databases? In this plot we look at where they store their route objects, with the third party IRRs used on the Y axis, and any combination of that is shown here. 33% of the ASes use one to two third party IRRs only, and 60% use one authoritative IRR plus up to two third party IRRs in addition. The predominant use case here is use of the official IRR database, but with one or two databases in addition for your peers.
As opposed to that, AS sets look quite different. Essentially this is the same plot, but you can see that where you store your AS set is more of an either/or question than for route objects: 38% of the ASes use one third‑party IRR and 58% of the ASes use one authoritative IRR, and everything else is essentially a no‑use case. Which of course makes sense, because those are updated frequently.
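The combination analysis just described can be sketched in a few lines. This is an illustrative assumption of how one might tally it; the authoritative-source set and sample data are invented, and the talk's actual tooling is not shown here.

```python
from collections import Counter

# Assumed set of authoritative (RIR-operated) IRR source names.
AUTHORITATIVE = {"RIPE", "ARIN", "APNIC", "LACNIC", "AFRINIC"}

def combination_profile(pairs):
    """pairs: (asn, source-database) per route object.
    Returns a Counter keyed by (n_authoritative, n_third_party) per AS."""
    per_as = {}
    for asn, source in pairs:
        per_as.setdefault(asn, set()).add(source)
    profile = Counter()
    for sources in per_as.values():
        n_auth = sum(1 for s in sources if s in AUTHORITATIVE)
        n_third = len(sources) - n_auth
        profile[(n_auth, n_third)] += 1
    return profile

# Toy input: three ASes with different storage habits.
pairs = [
    ("AS64496", "RIPE"), ("AS64496", "RADB"),
    ("AS64497", "RADB"), ("AS64497", "NTTCOM"),
    ("AS64498", "ARIN"),
]
profile = combination_profile(pairs)
print(profile[(1, 1)])  # 1 -> one AS uses one authoritative + one third-party IRR
```

Plotting such a Counter with authoritative counts on one axis and third-party counts on the other reproduces the kind of heatmap the speaker describes.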
We also had a look at the global or local relevance of IRRs.
To explain this plot: on the bottom you have, again, the database names together with their share of the global route objects; the shaded ones are those holding very few route objects, less than 1% of the global route objects.
And if a database is higher up on this plot, it means it has more local route objects; if it is lower, it has more global route objects. Local and global are defined relative to the RIR region of the headquarters of the respective database.
And you see that the very small IRRs are highly local in scope. So there's stuff like, for instance, CANARIE, which is operated by the Canadian research network; they are only relevant in Canada, and the same is true for others. You see that the authoritative IRRs run by the RIRs, the orange ones, are mostly localised, roughly 90% or more local objects and only 10% or less outside their own RIR region. If you move to the right, you see that the largest third party IRRs have global scope, probably as you would expect: Level 3, RADB, and RIPE NONAUTH, which is unsurprisingly the most global IRR because of the split between AfriNIC and RIPE. So essentially many objects are out of reach for these databases.
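The locality metric behind this plot reduces to a simple share. A hedged sketch follows, assuming a prefix-to-region lookup built from RIR delegation data; the toy mapping and function names are illustrative, not the speakers' code.

```python
# Share of a database's route objects registered in the same RIR region
# as the database's headquarters ("local"), per the talk's definition.
def locality_share(prefixes, home_region, region_of):
    """prefixes: iterable of prefix strings; region_of: prefix -> RIR region."""
    regions = [region_of(p) for p in prefixes]
    local = sum(1 for r in regions if r == home_region)
    return local / len(regions) if regions else 0.0

# Toy stand-in for a delegated-stats lookup table.
toy_regions = {
    "192.0.2.0/24": "ARIN",
    "198.51.100.0/24": "RIPE",
    "203.0.113.0/24": "APNIC",
}
share = locality_share(toy_regions.keys(), "RIPE", toy_regions.get)
print(round(share, 2))  # 0.33
```

A database scoring near 1.0 would sit at the top of the plot (highly local); one near 0.0, like the big third-party IRRs, at the bottom (global).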
How often are objects updated?
Well, for that we looked at the timestamps of the objects and how frequently they are changed. Timestamps are handled differently across all of the databases, so you have to check for each of them how they use their timestamps, and I hope we found all the individual uses there. If you look at this, you have the CDF: the orange lines are the authoritative IRRs and the blue ones are the third party IRRs, and what you can see is that route objects in authoritative IRRs are updated more regularly, twice as often as others. AS sets look better for third party IRRs, but this is mainly caused by TC; an interesting finding is that if you remove TC from this plot, the dotted lines align much more and you come up with roughly the same result.
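The freshness comparison can be sketched as an empirical CDF over object ages. The ages below are invented for illustration; as the speaker notes, real dumps encode last-modified information differently per database, so extracting the ages is the hard part, not this step.

```python
# Empirical CDF over object ages (days since last change).
def empirical_cdf(ages):
    xs = sorted(ages)
    n = len(xs)
    return [(x, (i + 1) / n) for i, x in enumerate(xs)]

def share_younger_than(ages, cutoff):
    """Fraction of objects last touched within `cutoff` days."""
    return sum(1 for a in ages if a <= cutoff) / len(ages)

auth_ages = [30, 90, 180, 365]      # invented authoritative-IRR object ages
third_ages = [180, 400, 900, 2000]  # invented third-party-IRR object ages

auth_fresh = share_younger_than(auth_ages, 365)
third_fresh = share_younger_than(third_ages, 365)
print(auth_fresh, third_fresh)  # 1.0 0.25
```

Plotting `empirical_cdf` per database, coloured by authoritative versus third party, gives the kind of figure described above.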
Another thing that interested us was: do objects match the default-free zone (DFZ)? That was one of the most interesting plots, I think. What we have on the X axis are route objects with origins conflicting with the DFZ origin, an obvious conflict with operational reality, and on the Y axis you see route objects with no route in the DFZ, which are not really relevant for the DFZ. If a database performs well, it turns up on the bottom left; if it performs badly, it turns up on the top right, with a lot of objects that are irrelevant to or conflicting with the DFZ. The unlikely winner you see here again: of all databases, TC performs best. We were so surprised by the result that we asked them, and they wrote back saying they enforce strict rules on their data quality. So if maintainers of databases want to look for a good example, go and ask them.
The other finding is that nearly all third‑party IRRs have worse alignment with the DFZ, and the authoritative IRRs have better alignment with the DFZ. Here the result is pretty clear, right?
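The DFZ comparison just summarised amounts to a three-way check per route object. A minimal sketch under toy data follows; the DFZ table here stands in for real BGP routing data, and the category names mirror the plot's axes rather than any published tool.

```python
# Classify route objects against an observed DFZ table:
#  - "irrelevant": prefix never seen in the DFZ (Y axis of the plot)
#  - "conflicting": prefix seen, but with a different origin (X axis)
#  - "consistent": prefix and origin match operational reality
def classify_against_dfz(route_objects, dfz):
    """route_objects: list of (prefix, origin); dfz: prefix -> set of origins."""
    counts = {"consistent": 0, "conflicting": 0, "irrelevant": 0}
    for prefix, origin in route_objects:
        if prefix not in dfz:
            counts["irrelevant"] += 1
        elif origin in dfz[prefix]:
            counts["consistent"] += 1
        else:
            counts["conflicting"] += 1
    return counts

dfz = {"192.0.2.0/24": {"AS64496"}}
objs = [("192.0.2.0/24", "AS64496"),     # matches the DFZ
        ("192.0.2.0/24", "AS64511"),     # conflicting origin
        ("198.51.100.0/24", "AS64497")]  # not in the DFZ at all
counts = classify_against_dfz(objs, dfz)
print(counts)  # {'consistent': 1, 'conflicting': 1, 'irrelevant': 1}
```

A database's position in the scatter plot is then its conflicting share versus its irrelevant share.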
So, in general, do authoritative IRRs represent operators better than the third party IRRs? I think the answer is yes: they are less in conflict with the DFZ and have a higher object update frequency. The only exception is TC, which is highly localised and holds less than 1% of the global route objects. I think that part is a clear win for authoritative IRRs. The other question: is the juice worth the squeeze?
So we could also simply wait for RPKI to take over, and this is also a security question of course: how many injectable objects are actually left in third‑party IRRs, not stored anywhere else? For that we looked at the data quality of route objects and categorised them into three categories per IRR database. The red ones are in conflict with other information: RPKI invalids, conflicting official IRR information, or DFZ announcements. Yellow are duplicates or somehow irrelevant for operational reality: covered by RPKI without a conflict, meaning the same origin in both objects, covered by official IRR records, or not seen in the DFZ. And the remaining part is what is actually stored in third party IRRs, is relevant, and is injectable if someone stores the same route object in a different database.
And one of the findings here, and I'm exemplifying this with RADB because it's nearly the same everywhere: of the 1.2 million route objects RADB holds, 84% are conflicting or irrelevant. That probably aligns with the gut feeling of a lot of people.
Still, if you look at what is left, there are more than 200,000 relevant, vulnerable route objects. Even if a lot of objects there are not very well maintained, that is still, in absolute numbers, a pretty relevant amount of route objects not stored somewhere else, and the share is actually similar for most other third party IRRs, even though RADB is by far the most relevant.
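The red/yellow/remainder categorisation described two paragraphs above can be sketched per object. This is a hedged approximation of the stated rules; the coverage tables are toy stand-ins for RPKI, authoritative IRR, and DFZ data, and single origins are assumed per prefix for brevity.

```python
# Categorise a third-party route object per the talk's scheme:
#  "conflicting" (red): contradicts RPKI, authoritative IRR, or the DFZ
#  "irrelevant" (yellow): duplicate coverage or never seen in the DFZ
#  "injectable" (remainder): only this object protects the route
def categorise(obj, rpki, auth_irr, dfz):
    prefix, origin = obj
    conflicting = any(prefix in src and src[prefix] != origin
                      for src in (rpki, auth_irr, dfz))
    if conflicting:
        return "conflicting"
    covered = any(src.get(prefix) == origin for src in (rpki, auth_irr))
    if covered or prefix not in dfz:
        return "irrelevant"
    return "injectable"

rpki = {"192.0.2.0/24": "AS64496"}
auth_irr = {}
dfz = {"192.0.2.0/24": "AS64496", "198.51.100.0/24": "AS64497"}

print(categorise(("192.0.2.0/24", "AS64496"), rpki, auth_irr, dfz))    # irrelevant
print(categorise(("192.0.2.0/24", "AS64511"), rpki, auth_irr, dfz))    # conflicting
print(categorise(("198.51.100.0/24", "AS64497"), rpki, auth_irr, dfz)) # injectable
```

The "injectable" bucket is the 200,000-plus objects the speaker is concerned about: routes with no protection elsewhere.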
We also looked at how many of these remaining route objects cover DE‑CIX and AMS‑IX routes, and in total we identified between 36,000 and 56,000 vulnerable route objects in our routing tables that are relevant for routes at our route servers, and more than 50% were stored in RADB.
So, is the juice worth the squeeze? Yes, I would say. We saw that the data quality in third party IRRs is abysmal, but in total, over all IRRs, there are more than 230,000 objects not stored anywhere else that can easily be hijacked, and we find a substantial share of them in the route tables for which they are relevant. With that, I am handing over to Stavros for legacy space.
STAVROS: Everyone's favourite topic, legacy space.
All right, let's have a talk about legacy space. For this question to be answered, we had to do a lot of email exchange with representatives from the RIRs, in order to be as accurate as possible and to confirm that we read and understood things correctly.
So yeah, in general, legacy space, as you all know, is space allocated to organisations before the five RIRs were established. But not all RIRs were established at the same time, right? We have RIPE NCC, which is the oldest one, founded in April 1992, while the youngest is AfriNIC, which was established back in October 2004.
Thus there are different ways of handling the legacy space that each RIR is responsible for.
So, in this table over here, we tried to summarise the results and give you an answer in just one slide. As you can see, on the very left we have the five RIRs, and in the second and third columns we have the most relevant services we care about, which are IRR access and RPKI access, and we list what we found about how they handle legacy space.
So as you can see, when it comes to RPKI access, all RIRs actually draw a hard line: if you don't have a service level agreement or a membership agreement with them, you are denied RPKI access, with the exception of LACNIC.
Which is not the case for IRR access. As you can see here, you can have IRR access: you can store your legacy objects in the official databases and have them served by Whois at almost all RIRs, with the exception of ARIN. ARIN actually blocks you from IRR access if you don't have a service level agreement with them.
I want to mention here that APNIC is an exception, and they have done a fantastic job: they don't actually have legacy space as such, because APNIC ran a huge campaign that lasted for many years and claimed back the legacy space. They actually approached people and worked with APNIC members in order to bring the legacy space into the database under membership agreements, and the space that was not being used, with nobody behind it, they actually claimed back and started redistributing to new members. This multi‑year campaign resulted in almost 99% of APNIC's legacy space being recovered.
And it should be mentioned here that all RIRs have processes and due diligence checks for when somebody comes and says, yeah, this is my legacy space, I want to bring it in under a membership agreement; you actually have to prove to the RIR that the space is yours.
So there are procedures and due diligence checks in place.
So after having this overview and understanding what's happening in the legacy space landscape, we had to reconsider our way of thinking and our methodology, and look at it from another point of view, because legacy space is actually not as big an issue as many people think. As you can see, all legacy space except ARIN's is actually, let's say, movable: if it exists in a third party database, it can be moved to an official database and served by Whois there, even without a new service agreement. ARIN, of course, requires a service agreement for legacy space, so it's an exception to that. But when we say a new service agreement, what do we mean? We mean that you have to convince the RIR that the space is yours and give them proof, and then they can do the rest of the procedure. Because of that, we ended up with a flow chart, as you can see here, whose steps we follow in order to understand better whether a space can be moved to an official database or not.
And then we can convert this flow chart into an algorithm in order to continue our research.
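Turned into code, the core decision of that flow chart might look like the sketch below. This is a hedged reading of the talk's summary table, not the team's actual algorithm: ARIN requires a service agreement, the other relevant RIRs serve non-member legacy holders, and AfriNIC is left undecided because no valid methodology existed for it in this study.

```python
# Decide whether a legacy route object found in a third-party IRR
# could be moved to the responsible RIR's authoritative database.
def movable_to_authoritative(rir, has_service_agreement):
    """Returns True/False where the talk gives a clear answer, None otherwise."""
    if rir == "ARIN":
        # ARIN blocks IRR access without a (legacy) service agreement.
        return has_service_agreement
    if rir in {"RIPE", "LACNIC", "APNIC"}:
        # Non-member legacy holders can still get IRR service here.
        return True
    # e.g. AFRINIC: the study had no valid method to confirm.
    return None

print(movable_to_authoritative("ARIN", False))  # False
print(movable_to_authoritative("RIPE", False))  # True
```

Running a classifier like this over every legacy-related route object yields the movable/non-movable/orphaned split shown in the next slides.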
So then the question is: where does the legacy space live, right? Based on our methodology, we ended up with this beautiful graph, where on the horizontal axis you can see the third party databases and on the vertical axis the route objects that exist, and in green we have highlighted the legacy space. We did this work only for ARIN, LACNIC and RIPE data, because for AfriNIC we didn't have a valid methodology to confirm it, and as we said, for APNIC it's not an issue any more. As you can see, 4.9% of RADB objects are related to legacy space and the rest are not. But if we zoom into this green space to understand better what is happening, then we end up with this graph over here, where again on the horizontal axis we have the third party databases, and here we have the different results for the legacy space.
In yellow we have the possibly orphaned objects: objects that would need a new service agreement but are not in active routes, which means we cannot find them in the DFZ, they are not covered by RPKI, and they don't have any presence in the authoritative IRRs.
In light green we have the non‑movable objects, which are ARIN legacy space that cannot be moved to an official database because they need a service level agreement, as we said.
But there is also a significant portion of objects, in dark green as you can see over there, which is ARIN space that can be moved to the authoritative database and be served by ARIN's Whois service.
And then the question about legacy space can be answered: as you can see, three out of four relevant RIRs provide IRR services to non‑member legacy space holders. Only ARIN remains problematic for now, and this problematic space actually represents 3.27% of all global route objects that exist in third party databases. These route objects need a new service agreement.
Conclusions. So, question number one: do authoritative IRRs represent operators better than third party IRRs? We say they do: they have more frequent updates and more coherence with the DFZ.
Then, is the juice worth the squeeze? We say yes, because more than 80% of low quality objects exist in third party IRRs like RADB, and we found from our research that more than 230,000 injectable, vulnerable route objects exist, and we don't want those vulnerabilities in our operations. And what about the legacy space? ARIN remains problematic, as you saw, but for the other RIRs a non‑member IRR service exists, so if you are the resource holder of legacy space in RIPE, for example, then you can still create your route objects in the RIPE database.
And we found from our research that 3.27% of ARIN legacy route objects exist in third party databases, mostly in RADB.
Outlook and next steps: this research is not done yet. We are planning, together with Matthias and his team, to extract the vulnerable prefixes that exist at the AMS‑IX and DE‑CIX route servers, identify the ones that might be lost because of such a policy or BCP, and map them to organisations. Then, of course, we will try to understand how much traffic goes into those routes, and come up with conclusions and more concrete answers.
And with that, I would like to thank you and open the microphone for your questions.
(APPLAUSE.)
WILL VAN GULIK: Thank you very much for that, that was really, really precise, and a lot of data. I am discovering TC, and that's an interesting one. So, do we have any questions online? Nothing? People are silent. OK. Oh, I see people coming.
AUDIENCE SPEAKER: Randy Bush... Considering that the majority of legacy space is in the ARIN region: A, you are measuring traffic at DE‑CIX and AMS‑IX; and B, you are saying the proportion is small, when in fact the total amount of space in the ARIN region is far larger than everybody else's combined. That's an accident of history, of course, the US being the initial allocation. But in general, I am a distributed guy, not a centralised guy, and here we have the two biggest exchanges in Europe centralising the IRR data. I'm not strongly opposed, it just goes against my instincts.
STAVROS KONSTANTARAS: I understand your question. We had a discussion about that, but again, the policy or BCP that we are working on affects only IXPs and only the European region. If we want to do it globally, yes, the numbers might shift a little bit, but it's a very focused scope, I would say, and indeed I don't expect much traffic from the US to come over here, to our region.
AUDIENCE SPEAKER: "Only affects IXPs" is like saying it only affects routers and not switches. So much traffic goes in the neighbourhood of the IXs; it's just the T1 peerings that go around the IXs, everything else goes through the IXs.
SPEAKER: We can probably also do this analysis with US traffic if we find some other... the data.
WILL VAN GULIK: Thank you, next.
AUDIENCE SPEAKER: Fantastic work, great research. I think legacy space still remains a problem; 3% is a lot. But I think this work could benefit other areas. First of all, it can incentivise people to move to RPKI; I am not talking about legacy, I acknowledge that's a problem and we need to think more about how to do this. The second one: back in 2021, when we created the CDN and cloud programme in MANRS, people were also thinking about this, but we couldn't be as radical as you are suggesting now. If you look at the actions for the CDN and cloud programme, what folks proposed is a sort of order of authentication: we start with authoritative IRRs and then go to some others, also cutting the long tail of IRRs mirrored by... as you showed on your slide. I think that's a great job, we should continue doing this, and I think it can create positive effects in other areas, not just the IXP environment.
WILL VAN GULIK: Thank you for that.
AUDIENCE SPEAKER: Angela Delara, policy officer at the RIPE NCC. I wanted to connect your presentation to what was signalled yesterday in the address policy working group. I also wanted to specify that at APNIC, this campaign followed a new policy that was implemented, coming from the APNIC community, and at LACNIC at the moment there is a policy proposal aiming at defining the requirements for legacy holders to have services. I also want to remind you that in our region, only legacy holders that have a direct or indirect contract with the RIPE NCC are able to use the RPKI service. So I want to invite everybody to review the presentation from registration services yesterday, because we are looking for feedback on what we can do, but this has to come from the community. So thank you also for your future feedback on that, if you want to let us know. Thank you.
WILL VAN GULIK: Amazing. So it seems you still have some work to do. Yes, thank you. So with that, if we don't have any... no comments online? Nothing more? Then thank you very much.
(APPLAUSE.) And so with that, it means that we are ahead of time, which is amazing.
And I also noticed that I did not mention our stenographer and our scribe when I opened the session, so thank you very much to them.
I would like to remind you all to rate the talks; that really helps us understand what's going on, what you like in our content, and how to improve what we will present to you.
I will also remind you that the Programme Committee election is coming up, and you still have time to vote in that.
That's an important one.
And with that, I think we will be able to leave you to go to the coffee break early, and we will see you all in Bucharest. Thank you!
(APPLAUSE.)
Coffee break.