[kdewebdev-site] Localisation

jacob coby jcobync at yahoo.com
Wed Apr 28 19:09:01 EDT 2004


--- David Joham <djoham at kdewebdev.org> wrote:
> Can I ask one more question? You have a lot of
> if(file_exists.... calls in 
> your code. Considering it's easily possible that we
> could have 50-100 
> different localized words on a page (and thus 50-100
> file checks), what kind 
> of performance hit are we talking about here? Is
> this going to kill our 
> scalability or am I missing something?

I'm not fully up to speed yet, but I thought I'd throw
in my 2 cents:

http://www.php.net/manual/en/function.file-exists.php

Pretty much anything that uses fstat/stat/lstat is
cached in PHP (see clearstatcache() in the manual).  The
first call on a given file can be expensive, but
repeated calls on that same file are basically free.
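
If anyone wants to check this on their own box rather than take my
word for it, a rough timing sketch along these lines should do.
The helper name time_file_exists() and the call count are made up
for illustration, and the numbers will vary with the OS and
filesystem.  Note that the manual also says PHP does not cache
information about non-existent files, so a missing language file
gets a real stat on every check.

```php
<?php
// Hypothetical micro-benchmark: time repeated file_exists() calls
// on a single path. Not a rigorous measurement, just a sanity check.
function time_file_exists(string $path, int $calls): float
{
    clearstatcache();            // start with an empty PHP stat cache
    $start = microtime(true);
    for ($i = 0; $i < $calls; $i++) {
        file_exists($path);      // same path each time
    }
    return microtime(true) - $start;
}

// Use this script itself as a file that is known to exist.
$elapsed = time_file_exists(__FILE__, 100);
printf("100 file_exists() calls took %.6f seconds\n", $elapsed);
```

Running it with 1 call and then with 100 calls gives a feel for how
much the extra checks cost on a given machine.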

Thinking that there will be 50-100 localized "words"
per page seems to be underestimating by a long shot.  

I think that whatever localization technique we use
should generate ONE localized file per page, not per
phrase or word.  

I haven't looked at any of the proposed systems yet,
so there may be some duplication with what I write
next.

I think that the default language (English, I'm
assuming) should be built in, and not require any
additional I/O.

A generated php page would do something like:

---
// set up the default (English) dictionary
$__dict[0] = 'blah blah';
$__dict[1] = 'another word';
// ... more $__dict[] entries

// handle localization; figure_out_the_language() is assumed
// here to return a language code such as "en-US"
$lang = figure_out_the_language();
if (file_exists("lang/$thisfile/$lang.php")) {
  // the $lang.php file generates an array to
  // overwrite/fill the $__dict[] array above
  include("lang/$thisfile/$lang.php");
} else {
  // no translation for this page: fall back to the default
  $lang = "en-US";
}

// display something
print "<li>".$__dict[0]."</li>\n";
print "<li>".$__dict[1]."</li>\n";
// ... more output

---
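
To make the "overwrite/fill" idea concrete, here is a minimal sketch
of how a partially translated page could fall back to the built-in
English phrases.  The helper overlay_translations(), the
$translated array, and the German phrase are all hypothetical; in
the real thing the array would come from the included
lang/$thisfile/$lang.php file rather than being defined inline.

```php
<?php
// Default (English) dictionary, built into the generated page.
$__dict = [
    0 => 'blah blah',
    1 => 'another word',
];

// Hypothetical helper: lay whatever translations exist over the defaults.
function overlay_translations(array $defaults, array $translated): array
{
    // array_replace() preserves the numeric keys; any phrase missing
    // from $translated keeps its English text.
    return array_replace($defaults, $translated);
}

// Stand-in for the array a per-page language file might define.
// Only phrase 1 has been translated so far.
$translated = [1 => 'ein anderes Wort'];

$__dict = overlay_translations($__dict, $translated);

print "<li>" . $__dict[0] . "</li>\n"; // untranslated, stays English
print "<li>" . $__dict[1] . "</li>\n"; // translated
```

The point of doing it this way is that a translator can ship one or
two phrases at a time and the page never ends up with holes in it.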

It would require some additional memory, yes.  And
each page would take a little longer to parse, yes.
However, I think in the long run, it would generate a
more resilient and more scalable site assuming:

1. English is the default language
2. Most will read the site in English
3. The time it takes to do a file_exists() and
include() is greater than the time it takes to parse
the English part of the original file.
4. The time it takes to include and parse a file is
acceptable.
5. Content developers are lazy.  They want to see
something with the minimum of fuss.
6. Translators are lazy.  They want to see that the
string they just translated works with the minimum of
fuss and/or missing content.  They want to be able to
translate one or two phrases when they have time, not
an entire page.
7. Developer turn-around.  OSS projects generally
don't keep programmers for very long.  You'll get a
handful of patches to fix some bug or scratch an
itch, and then they're gone.  K.I.S.S.

It's been my experience that 3,4,5 are true.  4 is
especially true if a PHP accelerator is used.  1,2 are
probably true.  I've never worked with translators, so
I dunno about 6.

As for resiliency, whatever we do should require the
minimum of command-line glue and cron scripts.  It
should generally be a tar/untar situation to move the
entire site.

Anyways, I thought I'd throw some more thoughts out
there, and hopefully we can get _something_ done.

=====
--
-Jacob


	
		

