| However one looks at it, the internet is an | | | | REGISTRY: apnic,arin,ripencc,lacnic,iana |
| amazing piece of technology. It allows | | | | COUNTRY_CODE: One of 240 unique |
| exposure to your products and services from | | | | 2-character country codes or "*" or |
| people all over the world. However, if | | | | "ZZ" (unassigned) |
| you're getting international visitors and you're not | | | | ADDRESS_TYPE: asn,ipv4,ipv6 |
| serving them with the correct | | | | ADDRESS: Either the starting IP Address or AS |
| content then you're potentially losing-out on a | | | | Number or "*" |
| large number of sales.We've all seen the seamless | | | | NUMBER: Number of IPs in range or "1" if |
| way that large sites such as Google seem to | | | | ADDRESS_TYPE is "asn" |
| 'know' | | | | DATE: Date IP range or AS Number was added |
| which territory a surfer is coming from and then | | | | to database or "*" |
| delivers content appropriately. The | | | | RANGE_TYPE: "allocated" -> borrowed; |
| truth, however, is that even though there is | | | | "assigned" -> owned |
| some cleverness behind the approach | | | | Only the ipv4 lines are needed, so the others can |
| there's no magical all-knowing code lying in the | | | | be discarded, as can any lines |
| background. With some fairly | | | | with wildcard '*' characters in them. This reduces |
| simple perl code and some FTP data downloads | | | | the file size to about 2.4 MB and |
| you too can replicate this | | | | about 750 000 lines. Generally I load this into a |
| functionality for your own website.Don't believe | | | | database and use the ISO 3166 |
| me? Well, in this article I'll show you exactly what | | | | country code definitions (search for this on the |
| you | | | | web) to build another database that |
| need to do to gain this functionality for your own | | | | converts the two-letter codes in the file above |
| web pages.The basic process involved is to glean | | | | to full country names (the only |
| your users' IP address (addresses of the form | | | | exception to this is GB which is the ipv4 code |
| 127.1.1.0) which tells you what server they're | | | | for the United Kingdom whereas the |
| using to access the internet. The | | | | ISO 3166 code is UK). I fix this by appending 'GB |
| example given on the left represents a | | | | United Kingdom' to the end of |
| dot-decimal notation which comprises four | | | | the code_to_name database I create. Once all |
| octets in decimal separated by periods. Everyone | | | | the data has been loaded into a |
| is familiar with this notation, | | | | database and the IP numbers have been |
| however it's not very useful if you need to find | | | | converted to decimal ranges I can start to |
| a number within an IP range (the | | | | do something useful.As a simple example here's |
| repositories provide us with IP ranges rather than | | | | an SQL query that fetches some data across the |
| a list of numbers). To convert | | | | two |
| from decimal-coded octets to a single decimal | | | | databases I've created: |
| number the following process is | | | | select cc.country, cc.code, ip.registry, ip.ip_from, |
| used (eg. for an IP 1.2.3.4) 1.2.3.4 = 4 + (3 * 256) | | | | ip.ip_to from country_codes cc, |
| + (2 * 256 * 256) + (1 * 256 * | | | | ip_maps ip where cc.code = ip.code and ip.code = |
| 256 * 256). The numbers are split and reversed | | | | "GL"; |
| then each is multiplied by | | | | and returns the following data: |
| 256^n with n in the range 0 to 3. This results in | | | | +-----------+------+----------+------------+------------+ |
| a pure decimal | | | | | country | code | registry | ip_from | ip_to | |
| number.OK, so you can get your site visitors' IP | | | | +-----------+------+----------+------------+------------+ |
| addresses and convert them to decimal | | | | | GREENLAND | GL | ripencc | 1481834496 | |
| notation (which aids comparisons), but how can | | | | 1481842687 | |
| you convert these to the country of | | | | | GREENLAND | GL | ripencc | 3266437120 | |
| origin for your visitors? Thankfully for us, the | | | | 3266445311 | |
| actual allocation of IP addresses so- | | | | +-----------+------+----------+------------+------------+ |
| called Regional Internet Registries (RIRs) which | | | | 2 rows in set (0.05 sec) |
| are RIPE NCC (Europe), ARIN (North | | | | This is a very simple query, but the code given |
| Americ and part of the Caribbean), LACNIC | | | | below shows you how to use PHP to |
| (Latin America and Caribbean), APNIC | | | | do something useful with the databases: |
| (Asia Pacific) and AfrINIC (Africa). Each of these | | | | #first the settings to link to the |
| publish a daily list of all the registry | | | | database$user="foo"; |
| data and they make these data available via | | | | $pwd="bar"; |
| anonymous FTP. To find them simply | | | | $host="my.mysql.host"; |
| navigate to the ARIN home page (which you can | | | | $database="ipv4_db";#now connect to the |
| find via any search engine) and look | | | | database |
| for the list of RIRs on the left-hand border. | | | | mysql_connect($host,$user,$pwd); |
| These link you out to the respective FTP | | | | @mysql_select_db($database) or die( "Unable to |
| sites. Each is a standard repository where the | | | | select database");$add;#fetch the incoming IP |
| latest version of the data file is | | | | address and perform some checks to make sure |
| available as delegated-*-latest so it's easy to | | | | #that this is validif (getenv("HTTP_CLIENT_IP")) |
| write an automated | | | | $add = getenv("HTTP_CLIENT_IP"); |
| script that will download the files on a daily | | | | else if(getenv("HTTP_X_FORWARDED_FOR")) |
| basis.Once you've grabbed the files from each | | | | $add = getenv |
| datacentre they will all have the format (eg | | | | ("HTTP_X_FORWARDED_FOR"); |
| from the European data file): | | | | else if(getenv("REMOTE_ADDR")) $add = |
| ripencc|GR|ipv4|62.1.0.0|65536|20000216|allocated | | | | getenv("REMOTE_ADDR"); |
| ripencc|CH|ipv4|62.2.0.0|65536|19981211|allocated | | | | else $add = "UNKNOWN";#now check that the IP |
| ripencc|SA|ipv4|62.3.0.0|8192|20000721|allocated | | | | address is valid$cmp = |
| ripencc|SA|ipv4|62.3.32.0|8192|20020109|allocated | | | | strcmp($add,"UNKNOWN");#if the IP address is |
| ripencc|GB|ipv4|62.3.64.0|16384|20010629|allocated | | | | validif ($cmp != 0) |
| ripencc|SE|ipv4|62.3.128.0|8192|20001005|allocated | | | | {#convert the octets into useful decimal$ip = |
| ripencc|PL|ipv4|62.3.160.0|8192|20020114|allocated | | | | sprintf("%u", ip2long($add));#now create the |
| ripencc|GB|ipv4|62.3.192.0|16384|20030212|allocated | | | | query. Here we're checking to find which IP |
| ripencc|FR|ipv4|62.4.0.0|8192|19970513|allocated | | | | range the incoming IP#address belongs to$query |
| As you can see from the example above the | | | | = "SELECT cc.country as ctry, cc.code as tlc, |
| data format is relatively simple and easy | | | | ip.registry |
| to parse as the data is delimited by the pipe "|" | | | | as rgst from country_codes cc,ip_maps ip |
| character.COLUMN VALUES | | | | where cc.code = ip.code and ".$ip." >= ip_from |
| ----------------------------------------------------------------- | | | | and ".$ip. |