Discover the amazing world of Indonesia


Internationalizing Your Websites

However one looks at it, the internet is anpipe  "|"  character.COLUMN VALUES
amazing  piece  of  technology.  It  allows
exposure to your products and services from---------------------------------------------
people  all  over  the  world.  However,  if--------------------
you're getting international visitors andREGISTRY:  apnic,arin,ripencc,lacnic,iana
you're  not  serving  them  with the correct
COUNTRY_CODE: One of 240 unique 2-character
content then you're potentially losing-outcountry  codes  or  "*"  or
on a large number of sales.We've all seen the
seamless way that large sites such as Google"ZZ"  (unassigned)
seem  to  'know'
ADDRESS_TYPE:  asn,ipv4,ipv6
which territory a surfer is coming from and
then  delivers  content  appropriately.  TheADDRESS: Either the starting IP Address or
AS  Number  or  "*"
truth, however, is that even though there is
some  cleverness  behind  the  approachNUMBER: Number of IPs in range or "1" if
ADDRESS_TYPE  is  "asn"
there's no magical all-knowing code lying in
the  background.  With  some  fairlyDATE: Date IP range or AS Number was added
to  database  or  "*"
simple perl code and some FTP data downloads
you  too  can  replicate  thisRANGE_TYPE: "allocated" -> borrowed;
"assigned"  ->  owned
functionality for your own website.Don't
believe me? Well, in this article I'll showOnly the ipv4 lines are needed, so the
you  exactly  what  youothers  can  be  discarded, as can any lines
need to do to gain this functionality forwith wildcard '*' characters in them. This
your own web pages.The basic process involvedreduces  the  file  size to about 2.4 MB and
is to glean your users' IP address (addresses
of  the  formabout 750 000 lines. Generally I load this
into  a  database  and  use  the  ISO  3166
127.1.1.0) which tells you what server
they're  using  to  access the internet. Thecountry code definitions (search for this on
the  web)  to  build  another  database that
example given on the left represents a
dot-decimal  notation  which  comprises fourconverts the two-letter codes in the file
above  to  full  country  names  (the  only
octets in decimal separated by periods.
Everyone  is  familiar  with  this notation,exception to this is GB which is the ipv4
code  for  the  United  Kingdom  whereas the
however it's not very useful if you need to
find  a  number  within  an  IP  range  (theISO 3166 code is UK). I fix this by
appending 'GB United Kingdom' to the end
repositories provide us with IP rangesof
rather  than  a list of numbers). To convert
the code_to_name database I create. Once all
from decimal-coded octets to a singlethe  data  has  been  loaded  into  a
decimal  number  the  following  process  is
database and the IP numbers have been
used (eg. for an IP 1.2.3.4) 1.2.3.4 = 4 +converted  to  decimal ranges I can start to
(3  *  256)  +  (2 * 256 * 256) + (1 * 256 *
do something useful.As a simple example
256 * 256). The numbers are split andhere's an SQL query that fetches some data
reversed  then  each  is  multiplied  byacross  the  two
256^n with n in the range 0 to 3. Thisdatabases  I've  created:
results  in  a  pure  decimal
select cc.country, cc.code, ip.registry,
number.OK, so you can get your siteip.ip_from,  ip.ip_to from country_codes cc,
visitors' IP addresses and convert them to
decimalip_maps ip where cc.code = ip.code and
ip.code  =  "GL";
notation (which aids comparisons), but how
can  you  convert  these  to  the country ofand  returns  the  following  data:
origin for your visitors? Thankfully for us,
the  actual  allocation  of  IP addresses so-+-----------+------+----------+------------+-
-----------+
called Regional Internet Registries (RIRs)
which  are  RIPE  NCC  (Europe), ARIN (North| country | code | registry | ip_from |
ip_to |
Americ and part of the Caribbean), LACNIC
(Latin  America  and  Caribbean),  APNIC
+-----------+------+----------+------------+-
(Asia Pacific) and AfrINIC (Africa). Each of-----------+
these publish a daily list of all the
registry| GREENLAND | GL | ripencc | 1481834496 |
1481842687  |
data and they make these data available via
anonymous  FTP.  To  find  them  simply| GREENLAND | GL | ripencc | 3266437120 |
3266445311  |
navigate to the ARIN home page (which you
can  find  via  any  search engine) and look
+-----------+------+----------+------------+-
for the list of RIRs on the left-hand-----------+
border. These link you out to the respective
FTP2  rows  in  set  (0.05  sec)
sites. Each is a standard repository whereThis is a very simple query, but the code
the  latest  version  of  the  data  file isgiven  below  shows  you  how  to use PHP to
available as delegated-*-latest so it's easydo  something  useful  with  the  databases:
to  write  an  automated
#first the settings to link to the
script that will download the files on adatabase$user="foo";
daily basis.Once you've grabbed the files
from each datacentre they will all have the$pwd="bar";
format  (eg
$host="my.mysql.host";
from  the  European  data  file):
$database="ipv4_db";#now connect to the
database
ripencc|GR|ipv4|62.1.0.0|65536|20000216|alloc
atedmysql_connect($host,$user,$pwd);
@mysql_select_db($database) or die( "Unable
ripencc|CH|ipv4|62.2.0.0|65536|19981211|allocto select database");$add;#fetch the incoming
atedIP address and perform some checks to make
sure
ripencc|SA|ipv4|62.3.0.0|8192|20000721|alloca#that this is validif
ted(getenv("HTTP_CLIENT_IP")) $add =
getenv("HTTP_CLIENT_IP");
ripencc|SA|ipv4|62.3.32.0|8192|20020109|allocelse if(getenv("HTTP_X_FORWARDED_FOR")) $add
ated=  getenv
("HTTP_X_FORWARDED_FOR");
ripencc|GB|ipv4|62.3.64.0|16384|20010629|allo
catedelse if(getenv("REMOTE_ADDR")) $add =
getenv("REMOTE_ADDR");
ripencc|SE|ipv4|62.3.128.0|8192|20001005|alloelse $add = "UNKNOWN";#now check that the IP
catedaddress is valid$cmp =
strcmp($add,"UNKNOWN");#if the IP address is
validif  ($cmp  !=  0)
ripencc|PL|ipv4|62.3.160.0|8192|20020114|allo
cated{#convert the octets into useful decimal$ip
= sprintf("%u", ip2long($add));#now create
the query. Here we're checking to find which
ripencc|GB|ipv4|62.3.192.0|16384|20030212|allIP
ocated
range the incoming IP#address belongs
to$query = "SELECT cc.country as ctry,
ripencc|FR|ipv4|62.4.0.0|8192|19970513|allocacc.code  as  tlc,  ip.registry
ted
as rgst from country_codes cc,ip_maps ip
As you can see from the example above thewhere cc.code = ip.code and ".$ip." >=
data  format  is  relatively simple and easyip_from
to parse as the data is delimited by theand ".$ip.



1 A B C D E F 84 86 87 88 89 90 91 92 93 94 95 96 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130