TECHNICAL REPORT

Grantee: Nicole O'Connor
Project Title: Modelling and identifying IP address space fragmentation pressure points
Amount Awarded: AUD 24955.76
Dates covered by this report: 2019-10-10 to 2021-11-16
Report submission date: 2021-11-16
Economies where project was implemented: Australia
Project leader name: Peter Dell
Project Team: Alex Wang
Partner organization: Murdoch University

Project Summary

Despite exhaustion of the IPv4 address space commencing in 2011, the Internet has largely not transitioned to IPv6, and in fact the rate of IPv6 diffusion has recently begun to slow (Huston, 2018). The IPv6 transition is expected to take decades, and hence problems stemming from the lack of scalability of IPv4 will continue to affect the Internet for many years to come. Indeed, the number of allocated IPv4 address blocks continues to grow; this is enabled by the subdivision of existing allocations into multiple blocks, and is argued to allow unused or under-utilised address space to be moved to other organisations with greater need. The amount of address space that could potentially be reallocated in this way is substantial: the volume of routed IPv4 address space is considerably less than the total allocated IPv4 address space (Richter et al., 2015), suggesting that a considerable amount of unused address space could be transferred to other network operators. This typically involves partitioning existing IPv4 address blocks into smaller pieces and transferring some of those pieces to other operators. In some cases operators renumber their networks to free up contiguous address space which is subsequently transferred; while this can result in more effective use of address space, it also results in BGP routing table growth, one of the major scaling issues facing the Internet today (Gamba et al., 2017). In other cases network operators could migrate whole networks from public to private address space and deploy NAT before transferring address space elsewhere. Continuing the current practice of dividing address space into ever-smaller allocations while increasingly relying on NAT not only presents challenges for IPv6 diffusion efforts but will increasingly create ‘pressure points’ in economies or regions where allocations are smaller.
Further, it also increases the prevalence of layered NAT (sometimes dubbed ‘double NAT’), which can not only lead to a range of operational problems but also has security implications, including the creation of attack points to be targeted by malicious parties and increased difficulty in identifying hosts involved in botnet activity (BITAG, 2012). Nevertheless, there has been nothing to suggest that the practice will end in the foreseeable future. There has been no modelling to identify the economies or regions likely to be affected first by such pressure points, or how long this practice can continue. This project will develop a statistical model of the process, thus allowing countries at greatest risk to develop mitigation strategies, providing clarity to the Internet community, and giving stakeholders tasked with stimulating IPv6 diffusion a better understanding of differences between countries and economies.

Background and Justification

Universities, like other end-user organisations, are completely dependent on the Internet for communications, cloud-based services, and the operation of a wide range of applications.  Any threat to the long-term reliability of Internet services is therefore a threat to universities as much as it is to other organisations.

Continued use of IPv4 and limited IPv6 diffusion will have long-lasting and detrimental effects for Internet stakeholders including RIRs, IXPs, ISPs, and end-user organisations.  Continued reliance on work-arounds like NAT and ALGs will grow as the Internet also grows, and so too will the resulting operational and security challenges.  While this project therefore has relevance to a wide range of organisations, identifying areas that are likely to be acutely affected earlier than others is particularly relevant to the most vulnerable countries and economies.

The fundamental challenge investigated in this project is the ability of the Internet to continue to grow in different regions when allocations of address space remain unequal, and in some cases vastly unequal, between countries and economies on a per capita basis. Failure to address this problem will lead some countries to rely more and more on workarounds such as layered NAT while others will not, constraining innovation and Internet growth in some countries more than others.

In order to identify which countries are at greatest risk, and therefore allow organisations and policy-makers to respond in ways that promote equity and reduce disadvantage to more vulnerable countries, the research objective of this project is to create a model of IPv4 address demand, to enable Internet organisations such as RIRs to forecast demand for IPv4 address space in different economies.

Within this objective are several key research questions:

  • Research Question 1:  Which economies are more likely to experience challenges due to organisations’ inability to obtain IPv4 address space?  (This question relates to operational stability).
  • Research Question 2:  At what point is it likely that subdividing existing IPv4 address allocations will no longer be practicable?  (This question relates to operational stability).
  • Research Question 3:  To what extent does the subdivision of address blocks contribute to BGP table growth?  (This question relates to operational stability).
  • Research Question 4:  To what extent is IPv4 address shortage correlated with botnet activity?  (This question relates to security).

Answers to these research questions will contribute to the development of the Internet by enabling network operators and other stakeholders in vulnerable countries to be better prepared for the emergence of pressure points, and to proactively develop strategies to mitigate the consequences of those pressure points.

Project Implementation

This project aimed to answer the following questions:

  • Research Question 1:  Which economies are more likely to experience challenges due to organisations’ inability to obtain IPv4 address space? (RQ1)
  • Research Question 2:  At what point is it likely that subdividing existing IPv4 address allocations will no longer be practicable?  (RQ2)
  • Research Question 3:  To what extent does the subdivision of address blocks contribute to BGP table growth?  (RQ3)
  • Research Question 4:  To what extent is IPv4 address shortage correlated with botnet activity?  (RQ4)

To investigate RQ1 and RQ2 the project team collected IP address allocation data from the five RIRs (APNIC, ARIN, RIPE NCC, AFRINIC and LACNIC) and economic data from the World Bank’s World Development Indicators database. A Research Assistant was hired to collect the data and combine it into a data set suitable for further analysis. The new data set thus includes IP address allocation data for each country as well as macroeconomic variables (GDP, population, broadband penetration, telecommunications demand, Internet servers, high-tech imports and exports, etc.).
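
The combination step described above can be sketched as follows. This is a minimal illustration, not the project's actual pipeline: it assumes the allocation data arrive in the pipe-delimited "delegated" statistics layout published by the RIRs (registry|cc|type|start|value|date|status), which this report does not confirm, and the function names, sample records, and indicator values are all hypothetical placeholders rather than project data.

```python
from collections import Counter

# Illustrative records only (documentation address ranges, made-up dates).
# Assumed layout: registry|cc|type|start|value|date|status
SAMPLE_RIR_LINES = [
    "apnic|*|ipv4|*|4|summary",                        # summary line, skipped
    "apnic|AU|ipv4|192.0.2.0|256|20150301|allocated",
    "apnic|AU|ipv4|198.51.100.0|128|20160715|assigned",
    "apnic|AU|ipv4|198.51.100.128|128|20170210|allocated",
    "apnic|NZ|ipv4|203.0.113.0|256|20140101|allocated",
    "apnic|NZ|asn|64500|1|20150101|allocated",         # not IPv4, skipped
]


def count_ipv4_blocks(lines, year):
    """Count IPv4 records per economy dated on or before the end of `year`."""
    counts = Counter()
    for line in lines:
        fields = line.strip().split("|")
        # Skip header, summary, and non-IPv4 records.
        if len(fields) < 7 or fields[2] != "ipv4":
            continue
        economy, date, status = fields[1], fields[5], fields[6]
        if status not in ("allocated", "assigned"):
            continue
        if date[:4].isdigit() and int(date[:4]) <= year:
            counts[economy] += 1
    return counts


def merge_with_indicators(block_counts, indicators):
    """Join block counts with per-economy indicators, keeping matched economies."""
    rows = []
    for economy in sorted(block_counts):
        if economy in indicators:
            rows.append({"economy": economy,
                         "ipv4_blocks": block_counts[economy],
                         **indicators[economy]})
    return rows


blocks_2016 = count_ipv4_blocks(SAMPLE_RIR_LINES, 2016)
blocks_2017 = count_ipv4_blocks(SAMPLE_RIR_LINES, 2017)

# Placeholder indicator values, NOT real World Bank figures.
placeholder_indicators = {"AU": {"population": 1.0, "gdp_per_capita": 1.0}}
merged_rows = merge_with_indicators(blocks_2016, placeholder_indicators)
print(merged_rows)
```

Counting records at two cut-off years, as here, gives both the level of allocations in a base year and the year-on-year increase, which are the two kinds of quantity used in the models that follow.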

The Research Assistant then used the data set to test models with two dependent variables: the increase in the number of IPv4 address blocks and the increase in the number of ASNs from 2016 to 2017. Independent variables include the number of IPv4 address blocks allocated, total population, population growth, urban population, international tourism, total GDP and GDP per capita, measures of business density and business ease, foreign direct investment, percentage of firms using banks, labour force, school enrolment, high tech imports, goods and services imports, tech co-op grants, communications percentage of commerce exports, communications imports as a percentage of Balance of Payments, and electricity production.

Regarding the increase in IPv4 address blocks allocated, the number of IPv4 address blocks already allocated in 2016 has the largest effect and explains 80% of the variance. An increasing labour force and the percentage of communications-related exports also have statistically significant effects; however, the magnitude of these effects is minimal and they explain only an additional 2% of the variance. The table below shows unstandardized path coefficients.

                                   The increase of IPv4 addresses
IPv4 address in 2016               .044***
The increase of labour force       3.781E-5***
comms_pct_commerce_exports_2016    .809*

The correlation between the increase in IPv4 addresses and the increase in the number of ASNs is high (0.91). This effect is significant and explains an additional 12% of the variance. However, the raw number of ASNs is not a significant factor in the growth of IPv4 allocations.
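
The "explains X% of variance" and "additional Y% of variance" figures above are differences in R² between nested models. The sketch below shows one way to compute such an increment with ordinary least squares. It is an illustration under stated assumptions: the report does not say which software or estimation method was used, the data are synthetic (generated with a fixed seed, not project data), and the helper names are hypothetical.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ols_r_squared(predictors, y):
    """Fit y = b0 + b1*x1 + ... by least squares; return (coefficients, R^2)."""
    rows = [[1.0] + list(values) for values in zip(*predictors)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1.0 - ss_res / ss_tot


# Synthetic economies: a base-year allocation level plus one extra predictor.
rng = random.Random(0)
ipv4_2016 = [rng.uniform(10, 2000) for _ in range(80)]
labour_delta = [rng.uniform(0, 100) for _ in range(80)]
ipv4_increase = [0.05 * a + 0.2 * b + rng.gauss(0, 10)
                 for a, b in zip(ipv4_2016, labour_delta)]

_, r2_base = ols_r_squared([ipv4_2016], ipv4_increase)
_, r2_full = ols_r_squared([ipv4_2016, labour_delta], ipv4_increase)
print("R^2, base-year allocations only:", round(r2_base, 3))
print("additional R^2 from second predictor:", round(r2_full - r2_base, 3))
```

Because the second model nests the first, its R² can never be lower; the increment is what is reported as the "additional" variance explained.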

Regarding the increase in the number of ASNs, the number of ASNs [ASN_2016], the level of technical cooperation grant funding [tech_coop_grants_2016] (i.e. foreign aid and other funding intended to finance the transfer of technical and managerial skills or of technology for the purpose of building up general national capacity), and the rate of change of technical cooperation grant funding [tech_coop_grants_delta] all had significant effects. The number of ASNs explains 61% of the variance, while the other two variables explain an additional 25%.

                          The increase of ASN
ASN_2016                  .19***
tech_coop_grants_2016     -1.063E-6*
tech_coop_grants_delta    -1.001E-6*

To summarize, in answer to RQ1 and RQ2, we find that the economies with more existing IPv4 address allocations will experience greater demand for further IPv4 address allocations, and are therefore more likely to (eventually) experience IPv4 address demand that cannot easily be met. We found that demand for IPv4 address allocations has a level of inertia that appears relatively stable over time. However, we also find that the rate of subdivision of IPv4 address blocks into smaller allocations is such that this trend can continue for a considerable time into the future. For this reason it is not possible to predict with any level of confidence when it will become difficult or impossible for the phenomenon to continue.

To investigate RQ3 (to what extent does the subdivision of address blocks contribute to BGP table growth), we analysed BGP data (Route Views) from January 2013 to December 2017, collected from various locations worldwide, and then ran a correlation analysis to examine how the length of the BGP table related to the number of IPv4 address allocations. We introduced lags of 1 to 3 months to the IPv4 allocation data to investigate the potential for a delayed impact of IPv4 allocations on BGP. Overall, the results show that the correlations are high (ranging from .698 to .997), regardless of the location in which the BGP data was collected. There are no clear patterns of correlations across different time lags. We conclude that changes in the level of IPv4 allocations quickly lead to changes in the size of the BGP routing table. This indicates that the number of IPv4 address allocations has a strong impact on subsequent growth of BGP tables. However, effects differ between regions: they are smaller in North America (Washington DC, Palo Alto, and Atlanta), South America (Sao Paulo) and East Africa (Nairobi), but relatively larger in Australia (Perth and Sydney), Southern Africa (Johannesburg), Europe (London), and East Asia (Tokyo).

The table below shows the correlations between BGP table lengths and the number of IPv4 addresses.

Correlations    WAIX     KIXP       Sydney  ISC          Eqix  JINX            LINX      Sao Paulo  DIXIE    Telxatl
                (Perth)  (Nairobi)          (Palo Alto)  (DC)  (Johannesburg)  (London)             (Tokyo)  (Atlanta)
Lag 1 month:    .981     .721       .944    .733         .698  .967            .936      .782       .997     .832
Lag 2 months:   .982     .717       .944    .740         .726  .963            .936      .762       .997     .827
Lag 3 months:   .982     .718       .943    .743         .752  .960            .937      .739       .997     .824

To summarize, for RQ3, we find that the effect of IPv4 address allocation on BGP table growth is quite strong. Given that the correlation is close to 1 in many places, this relationship is almost linear. We conclude that IPv4 address allocations can quickly increase the size of BGP routing tables in many locations.
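
The lagged correlation analysis used for RQ3 can be sketched as follows. This is a minimal illustration assuming monthly series of equal length and a plain Pearson correlation; the report does not specify the exact estimator, the function names are hypothetical, and the two series are synthetic stand-ins generated with a fixed seed rather than Route Views or RIR data.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def lagged_correlation(allocations, table_sizes, lag):
    """Correlate allocations in month t with BGP table size in month t + lag."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    x = allocations[:len(allocations) - lag]
    y = table_sizes[lag:]
    return pearson(x, y)


# Synthetic 60-month series (January 2013 to December 2017 spans 60 months).
rng = random.Random(1)
allocations = []
total = 1000
for _ in range(60):
    total += rng.randint(5, 40)          # cumulative allocation count
    allocations.append(total)

# Hypothetical table size: follows allocations one month later, plus noise.
table_sizes = [3.0 * allocations[max(t - 1, 0)] + rng.gauss(0, 20)
               for t in range(60)]

for lag in (1, 2, 3):
    print("lag", lag, "months:", round(lagged_correlation(allocations, table_sizes, lag), 3))
```

One point worth keeping in mind when reading such results: when both series grow steadily over time, correlations of their levels tend to be high at every lag, which is consistent with the absence of a clear pattern across lags noted above.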

To investigate RQ4, which asked to what degree IPv4 address shortage is correlated with botnet activity, the project team analysed IPv4 address allocation data in conjunction with data from the Composite Blocking List (CBL). The CBL contains data about botnets that are suspected of spreading spam, viruses and malware. In the CBL data, ‘Size’ means the approximate total number of IP addresses assigned.

Analysis using the increase in IPv4 addresses as the independent variable and the volume of IPv4 addresses as the control variable showed that the increase in IPv4 addresses has a positive effect on Listings, %total, and IPop, and a negative effect on Size.

The table below shows unstandardized path coefficients.

            Listings  %total  %Infections  Traffic  %Traffic  Spams/Bot  Size(K)     Infect%  IPop       IPop%
IPv4_2016   -17.37    .000    -.000        43.52    .000      .000       43.78***    -.000    2821.40    -.000
IPv4_delta  759.64**  .005*   .000         423.59   .001      -.014      -146.07***  .000     83769.27*  -.000

We also used the increase in the number of ASNs as the independent variable and the raw number of ASNs as a control variable. The results show that the increase in the number of ASNs has a significant effect only on Size, and that this effect is negative.

            Listings  %total  %Infections  Traffic  %Traffic  Spams/Bot  Size(K)     Infect%  IPop       IPop%
ASN_2016    1.80      .000    -.000        125.93*  .000*     -.000      85.89***    -.000    7782.47*   -.000
ASN_delta   513.36    .004    .000         135.86   .000      -.009      -212.55**   -.000    76963.62   -.000

Overall, we find that an increase in the number of IPv4 allocations indeed leads to more listings of infected IPv4 addresses.

These findings were presented to community members from APNIC; during the discussion at the conclusion of the presentation it became clear that widely-held assumptions about the future of IPv4, and its intended replacement IPv6, are not necessarily valid.  In particular, the assumption that an ‘address-crunch’ in the IPv4 address space is imminent appears doubtful, challenging the urgency of the need for IPv6.  The project team was challenged to reconsider their fundamental assumption that a transition to IPv6 in response to IPv4 address limitations was necessary.

Lastly, the project implementation was disrupted by COVID-19-related travel restrictions, which prevented the project team from travelling to present the findings above at various conferences. The project team therefore attempted to ‘pivot’, reallocating funding that would have been used for travel to interviews with key industry figures, in order to investigate the assumption that a transition to IPv6 will be necessary. The key question to be answered here would be: if IPv4 address issues are not a significant challenge for Internet governance, what are the issues that are faced instead?

Project Evaluation

The technical tasks progressed well and without incident. Approximately 30 GB of data was imported from various sources and processed into formats suitable for analysis. The analysis revealed some surprising findings, namely that neither economic growth nor population growth was strongly implicated in the growth of IPv4 allocations. Subsequent conversations with personnel from APNIC revealed that some fundamental and widely-held assumptions about IPv4, and the role of IPv6 for the future of the Internet, might not be valid. This would contradict the vast majority of commentary about technical aspects of the Internet, and challenged the project team to revisit these fundamental assumptions.

Attempts were made to identify relevant contacts in Internet governance circles, but without success. One interview was conducted with a consultant who attempted to introduce the research team to contacts within the APNIC region, but this also did not succeed. The research team also explored access to reports that are not in the public domain (possibly through an NDA), but this could not be achieved.

Without access to relevant interview participants and other information the research team has not been able to progress this alternative research direction further.  Critiquing fundamental assumptions about the role of IPv6 for the future remains an objective for future research projects for the project team.

An evaluation of specific tasks within the project is in the table below.

Tasks                                             Status
Import and repurpose RIR data                     Finished
Import and repurpose WB data                      Finished
Model development                                 Finished
Analysis – RQ1                                    Finished
Analysis – RQ2                                    Finished
Import and repurpose BGP data                     Finished
Analysis – RQ3                                    Finished
Import and repurpose security data                Finished
Analysis – RQ4                                    Finished
Write final project reports                       Finished
Travel to APRICOT and ICIS to present findings    Canceled

Indicator: Import IPv4 data completed
Baseline: No data imported
Project activities related to indicator: Import IPv4 data to repository; Transform data to format suitable for analysis
Outputs and outcomes: IPv4 data populated in data repository
Status: Complete

Indicator: Import economic data completed
Baseline: No data imported
Project activities related to indicator: Import WB data to repository; Transform data to format suitable for analysis
Outputs and outcomes: Economic data populated in data repository
Status: Complete

Indicator: Analysis of IPv4 data completed
Baseline: No analysis completed
Project activities related to indicator: Creation and testing of statistical models; Identification of the model with the best model fit
Outputs and outcomes: Refined statistical models have been created
Status: Complete

Indicator: Import BGP data completed
Baseline: No data imported
Project activities related to indicator: Import BGP data to repository; Transform data to format suitable for analysis
Outputs and outcomes: BGP data populated in data repository
Status: Complete

Indicator: Import security data completed
Baseline: No data imported
Project activities related to indicator: Import security data to repository; Transform data to format suitable for analysis
Outputs and outcomes: Security data populated in repository
Status: Complete

Indicator: Analysis of BGP data completed
Baseline: No analysis completed
Project activities related to indicator: Identify relationships between IPv4 and BGP data; Statistical analysis of BGP and IPv4 data
Outputs and outcomes: Completed statistical analysis
Status: Complete

Indicator: Analysis of security data completed
Baseline: No analysis completed
Project activities related to indicator: Identify relationships between IPv4 and security data; Statistical analysis of security and IPv4 data
Outputs and outcomes: Completed statistical analysis
Status: Complete

Gender Equality and Inclusion

In addition to gender diversity at the team level expected by Curtin’s Athena SWAN accreditation, the project leads to other benefits for women. The Internet can empower people and communities, particularly in less developed countries (Wheeler, 2011). By identifying countries that are more vulnerable to Internet disruption caused by the pressure points investigated in this study, we allow those countries to develop mitigation strategies and thus allow the Internet to benefit those citizens with most to gain. The Internet is a particular source of empowerment for women: it allows increased access to information and/or professional development that might otherwise have been accessible only to men, enables them to expand their social networks and social capital, and transforms social and political awareness (Wheeler, 2007). Hence we believe that this study is of particular benefit to women in such countries.

Project Communication Strategy

The outcomes and concrete benefits to the Internet community and to more vulnerable nations in particular have been explained at length in response to questions above.  Here, we describe the communication strategies to ensure that the research benefits are realised.

Due to the outbreak of COVID-19, we were not able to present the findings at the APRICOT conference. To ensure that the results are communicated to practitioners in the field and to other researchers, we plan to do the following after the final report is approved. We note the importance of ensuring such communications reach a wide range of countries where the conclusions are likely to be more significant.

  • APNIC blog posts to share lessons learned along the way

Recommendations and Use of Findings

We make four recommendations:

  1. Detailed analysis of the IPv4 address allocation data revealed that address space is rarely moved from one region or economy to another.  Where address space is subdivided into multiple allocations, the resulting allocations typically remain within the same region.  We recommend further research to investigate whether there are unnecessary obstacles to the movement of address space from one region to another, and if so, whether policies and practices could be changed to facilitate greater movement of address space allocations between regions.
  2. Analysis of BGP and IPv4 data revealed a clear relationship between the increase in the number of IPv4 allocations and the size of the BGP routing table.  The strength of this relationship varies between locations around the globe, but it was present at all 10 locations tested.  We also note that as at 2021, the size of the BGP routing table continues to grow and is an increasing challenge.  We recommend further research to explore the feasibility of ‘defragmenting’ the IPv4 address space.  At the same time we recognise the many technical and administrative challenges this poses for many stakeholders; one focus of further research could be to enumerate these challenges and explore strategies by which they might be overcome.
  3. This project has identified a relationship between the extent to which the IPv4 address space is fragmented, and the volume of botnet activity.  While this project has identified the phenomenon exists, it is not clear why it is so.  Therefore we recommend further exploration to understand how increased fragmentation leads to increased botnet activity.
  4. The project team attempted to pivot to conduct interviews with key industry figures to explore what the key challenge(s) facing Internet governance are, if IPv4 address space issues are not as challenging as had been assumed at the beginning of the project.  This question remains open; we therefore recommend further research to identify potential issues that are not currently widely acknowledged.
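
To make the notions of subdivision and ‘defragmenting’ in recommendation 2 concrete, the sketch below uses Python's standard `ipaddress` module to split one block into smaller pieces and then re-aggregate contiguous pieces. It is an illustration only: the address range is a documentation prefix, the transfer scenario is hypothetical, and the helper names are not from the project.

```python
import ipaddress


def subdivide(block, new_prefix):
    """Split one address block into equal pieces of the given prefix length."""
    return list(ipaddress.ip_network(block).subnets(new_prefix=new_prefix))


def defragment(blocks):
    """Merge contiguous blocks into the fewest covering prefixes."""
    return list(ipaddress.collapse_addresses(blocks))


# One /24 allocation becomes four /26 allocations after subdivision.
pieces = subdivide("198.51.100.0/24", 26)
print(len(pieces), "pieces:", [str(p) for p in pieces])

# Hypothetical scenario: the last piece is transferred to another operator.
retained = pieces[:3]
transferred = pieces[3:]

# The retained space can no longer be described by a single prefix.
print("retained, aggregated:", [str(p) for p in defragment(retained)])

# If all pieces were held together again, they would collapse to one prefix.
print("all pieces, aggregated:", [str(p) for p in defragment(pieces)])
```

In this toy case a single prefix turns into three once part of the block moves elsewhere (two for the retained space and one for the transferred piece), which is the mechanism by which subdivision can add entries to routing tables.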

Bibliography

BITAG (Broadband Internet Technical Advisory Group), 2012.  “Implications of Large Scale Network Address Translation (NAT)”, Available from https://www.bitag.org/documents/BITAG_TWG_Report-Large_Scale_NAT.pdf (accessed 10 April 2019).

Gamba J, Fontugne R, Pelsser C, Bush R, Aben E, 2017. “BGP Table Fragmentation: What and Who?”, Rencontres Francophones sur la Conception de Protocoles, l’Évaluation de Performance et l’Expérimentation des Réseaux de Communication [Francophone Meetings on Protocol Design, Performance Evaluation and Experimentation of Communication Networks], May 2017, Quiberon, France.

Huston, G. 2018. “What Drives IPv6 Deployment?”, https://labs.apnic.net/?p=1142.

Richter P, Allman M, Bush R, Paxson V, 2015. “A Primer on IPv4 Scarcity”, http://arxiv.org/abs/1411.2649v3.

Wheeler, D.L., 2007. “Empowerment zones? Women, Internet cafés, and life transformations in Egypt”. Information Technologies and International Development 4(2):89-104.

Wheeler, D.L., 2011. “Does the Internet Empower? A Look at the Internet and International Development”, in M. Consalvo & C. Ess (eds), The Handbook of Internet Studies, Wiley-Blackwell, Chichester: UK.