| However one looks at it, the internet is an | | | | pipe "|" character.COLUMN VALUES |
| amazing piece of technology. It allows | | | | |
| | | | |
| exposure to your products and services from | | | | --------------------------------------------- |
| people all over the world. However, if | | | | -------------------- |
| | | | |
| you're getting international visitors and | | | | REGISTRY: apnic,arin,ripencc,lacnic,iana |
| you're not serving them with the correct | | | | |
| | | | COUNTRY_CODE: One of 240 unique 2-character |
| content then you're potentially losing-out | | | | country codes or "*" or |
| on a large number of sales.We've all seen the | | | | |
| seamless way that large sites such as Google | | | | "ZZ" (unassigned) |
| seem to 'know' | | | | |
| | | | ADDRESS_TYPE: asn,ipv4,ipv6 |
| which territory a surfer is coming from and | | | | |
| then delivers content appropriately. The | | | | ADDRESS: Either the starting IP Address or |
| | | | AS Number or "*" |
| truth, however, is that even though there is | | | | |
| some cleverness behind the approach | | | | NUMBER: Number of IPs in range or "1" if |
| | | | ADDRESS_TYPE is "asn" |
| there's no magical all-knowing code lying in | | | | |
| the background. With some fairly | | | | DATE: Date IP range or AS Number was added |
| | | | to database or "*" |
| simple perl code and some FTP data downloads | | | | |
| you too can replicate this | | | | RANGE_TYPE: "allocated" -> borrowed; |
| | | | "assigned" -> owned |
| functionality for your own website.Don't | | | | |
| believe me? Well, in this article I'll show | | | | Only the ipv4 lines are needed, so the |
| you exactly what you | | | | others can be discarded, as can any lines |
| | | | |
| need to do to gain this functionality for | | | | with wildcard '*' characters in them. This |
| your own web pages.The basic process involved | | | | reduces the file size to about 2.4 MB and |
| is to glean your users' IP address (addresses | | | | |
| of the form | | | | about 750 000 lines. Generally I load this |
| | | | into a database and use the ISO 3166 |
| 127.1.1.0) which tells you what server | | | | |
| they're using to access the internet. The | | | | country code definitions (search for this on |
| | | | the web) to build another database that |
| example given on the left represents a | | | | |
| dot-decimal notation which comprises four | | | | converts the two-letter codes in the file |
| | | | above to full country names (the only |
| octets in decimal separated by periods. | | | | |
| Everyone is familiar with this notation, | | | | exception to this is GB which is the ipv4 |
| | | | code for the United Kingdom whereas the |
| however it's not very useful if you need to | | | | |
| find a number within an IP range (the | | | | ISO 3166 code is UK). I fix this by |
| | | | appending 'GB United Kingdom' to the end |
| repositories provide us with IP ranges | | | | of |
| rather than a list of numbers). To convert | | | | |
| | | | the code_to_name database I create. Once all |
| from decimal-coded octets to a single | | | | the data has been loaded into a |
| decimal number the following process is | | | | |
| | | | database and the IP numbers have been |
| used (eg. for an IP 1.2.3.4) 1.2.3.4 = 4 + | | | | converted to decimal ranges I can start to |
| (3 * 256) + (2 * 256 * 256) + (1 * 256 * | | | | |
| | | | do something useful.As a simple example |
| 256 * 256). The numbers are split and | | | | here's an SQL query that fetches some data |
| reversed then each is multiplied by | | | | across the two |
| | | | |
| 256^n with n in the range 0 to 3. This | | | | databases I've created: |
| results in a pure decimal | | | | |
| | | | select cc.country, cc.code, ip.registry, |
| number.OK, so you can get your site | | | | ip.ip_from, ip.ip_to from country_codes cc, |
| visitors' IP addresses and convert them to | | | | |
| decimal | | | | ip_maps ip where cc.code = ip.code and |
| | | | ip.code = "GL"; |
| notation (which aids comparisons), but how | | | | |
| can you convert these to the country of | | | | and returns the following data: |
| | | | |
| origin for your visitors? Thankfully for us, | | | | |
| the actual allocation of IP addresses so- | | | | +-----------+------+----------+------------+- |
| | | | -----------+ |
| called Regional Internet Registries (RIRs) | | | | |
| which are RIPE NCC (Europe), ARIN (North | | | | | country | code | registry | ip_from | |
| | | | ip_to | |
| Americ and part of the Caribbean), LACNIC | | | | |
| (Latin America and Caribbean), APNIC | | | | |
| | | | +-----------+------+----------+------------+- |
| (Asia Pacific) and AfrINIC (Africa). Each of | | | | -----------+ |
| these publish a daily list of all the | | | | |
| registry | | | | | GREENLAND | GL | ripencc | 1481834496 | |
| | | | 1481842687 | |
| data and they make these data available via | | | | |
| anonymous FTP. To find them simply | | | | | GREENLAND | GL | ripencc | 3266437120 | |
| | | | 3266445311 | |
| navigate to the ARIN home page (which you | | | | |
| can find via any search engine) and look | | | | |
| | | | +-----------+------+----------+------------+- |
| for the list of RIRs on the left-hand | | | | -----------+ |
| border. These link you out to the respective | | | | |
| FTP | | | | 2 rows in set (0.05 sec) |
| | | | |
| sites. Each is a standard repository where | | | | This is a very simple query, but the code |
| the latest version of the data file is | | | | given below shows you how to use PHP to |
| | | | |
| available as delegated-*-latest so it's easy | | | | do something useful with the databases: |
| to write an automated | | | | |
| | | | #first the settings to link to the |
| script that will download the files on a | | | | database$user="foo"; |
| daily basis.Once you've grabbed the files | | | | |
| from each datacentre they will all have the | | | | $pwd="bar"; |
| format (eg | | | | |
| | | | $host="my.mysql.host"; |
| from the European data file): | | | | |
| | | | $database="ipv4_db";#now connect to the |
| | | | database |
| ripencc|GR|ipv4|62.1.0.0|65536|20000216|alloc | | | | |
| ated | | | | mysql_connect($host,$user,$pwd); |
| | | | |
| | | | @mysql_select_db($database) or die( "Unable |
| ripencc|CH|ipv4|62.2.0.0|65536|19981211|alloc | | | | to select database");$add;#fetch the incoming |
| ated | | | | IP address and perform some checks to make |
| | | | sure |
| | | | |
| ripencc|SA|ipv4|62.3.0.0|8192|20000721|alloca | | | | #that this is validif |
| ted | | | | (getenv("HTTP_CLIENT_IP")) $add = |
| | | | getenv("HTTP_CLIENT_IP"); |
| | | | |
| ripencc|SA|ipv4|62.3.32.0|8192|20020109|alloc | | | | else if(getenv("HTTP_X_FORWARDED_FOR")) $add |
| ated | | | | = getenv |
| | | | |
| | | | ("HTTP_X_FORWARDED_FOR"); |
| ripencc|GB|ipv4|62.3.64.0|16384|20010629|allo | | | | |
| cated | | | | else if(getenv("REMOTE_ADDR")) $add = |
| | | | getenv("REMOTE_ADDR"); |
| | | | |
| ripencc|SE|ipv4|62.3.128.0|8192|20001005|allo | | | | else $add = "UNKNOWN";#now check that the IP |
| cated | | | | address is valid$cmp = |
| | | | strcmp($add,"UNKNOWN");#if the IP address is |
| | | | validif ($cmp != 0) |
| ripencc|PL|ipv4|62.3.160.0|8192|20020114|allo | | | | |
| cated | | | | {#convert the octets into useful decimal$ip |
| | | | = sprintf("%u", ip2long($add));#now create |
| | | | the query. Here we're checking to find which |
| ripencc|GB|ipv4|62.3.192.0|16384|20030212|all | | | | IP |
| ocated | | | | |
| | | | range the incoming IP#address belongs |
| | | | to$query = "SELECT cc.country as ctry, |
| ripencc|FR|ipv4|62.4.0.0|8192|19970513|alloca | | | | cc.code as tlc, ip.registry |
| ted | | | | |
| | | | as rgst from country_codes cc,ip_maps ip |
| As you can see from the example above the | | | | where cc.code = ip.code and ".$ip." >= |
| data format is relatively simple and easy | | | | ip_from |
| | | | |
| to parse as the data is delimited by the | | | | and ".$ip. |