Arabic Flash

The first time I came to Dubai I had to do the inevitable: grueling work with bidirectional text on dubious deadlines. At first I was shocked, appalled, and frustrated with Flash. Here is software that has been established for a decade but still cannot parse proper Arabic, even though Arabic is one of the most spoken languages in the world. You can imagine how massive the market is.

But after the 4th stage of grief I finally came to accept Flash’s weaknesses and started to explore a new workaround.

There is a licensed parser out there, but like a cat, I’m curious.

As you can see, it is not perfect. It doesn’t support embedded text (and it works only on Windows), so if you don’t have Arabic language support installed you won’t be able to view the sentences properly. It also supports only a few major fonts that have Unicode Arabic glyphs, such as Tahoma and Times New Roman. But despite all the flaws, I finally managed to feed dynamic Arabic text directly into a Flash TextField.

In a nutshell, the Arabic is parsed simply by inverting the letter order, because when you add Arabic text in Flash it is not treated as bidirectional text. To illustrate plainly in English: “how are you” in Flash will be parsed as “uoy era woh”. Below are the core functions I used to reverse the order.

package com.arabicDecoderAS3.core {
	
	/**
	* ...
	* @author Rahmat Hidayat
	*/
	
	import com.arabicDecoderAS3.tools.*;
	import com.arabicDecoderAS3.data.Constant;
	
	public class Decoder {
		
		protected var _count:int = 0;
		
		protected var _currValue:*;
		protected var _prevValue:*;
		
		public function Decoder() 
		{
			
		}
		
		private function restructureString(aTarget:Array):Array
		{
			var aSentence:Array = [];			
						
			for(var i:int = 0; i < aTarget.length; i++)
			{				
				if(ASCIIChecker.isLatin(aTarget[i].charCodeAt(0))) 
				{
					_currValue = (aTarget[i].charCodeAt(0) == 32 && _currValue != -1) ? Constant.SPACE : aTarget[i].charCodeAt(0);
					
					aSentence.unshift(aTarget[i].charCodeAt(0));	
														
					return aSentence;				
				}			
				else
				{
					_currValue 	= -1;
					_count 		= 0;
				}
				
				aSentence.push(aTarget[i].charCodeAt(0));		
				
			}
			
			return aSentence;
		}

		private function rebuildText(aTarget:Array):String
		{
			var sResult:String 	= '';
					
			for(var i:int = 0; i < aTarget.length; i++)
			{						
				for(var j:int = 0; j < aTarget[i].length; j++)
				{			
					if (ASCIIChecker.isLatin(aTarget[i][j])) sResult += String.fromCharCode(Constant.RLM);
					
					// Temporary solution: map the internal SPACE marker back to a regular space (char code 32)
					sResult += String.fromCharCode((aTarget[i][j] == Constant.SPACE) ? 32 : aTarget[i][j]);
				}					
			}	
			
			return sResult;			
		}
		
// EXECUTION
// -------------------------------------------

		/**
		 * 
		 * @param	aTarget	the source string, split into an array of characters
		 * @return	the rebuilt string
		 */
		public function getCode(aTarget:Array):String
		{			
			var aTemp:Array 	= [];
			var aResult:Array 	= [];
			var n:int;
				
			while(aTarget.length > 0)
			{												
				aTemp 	= restructureString(aTarget);
				n 		= aTemp.length;									
				
				if (ASCIIChecker.isNumber(_currValue) || ASCIIChecker.isLetter(_currValue) || _currValue == Constant.SPACE )	// SPACE still a hack
				{						
					(_count == 0) ? _count = IndexChecker.getIndex(aResult, _prevValue) : _count++;
				
					if (_count != -1) 
					{
						// Insert the Latin character right after index _count so that Latin text keeps its original left-to-right order
						aResult = aResult.splice(0, _count + 1).concat(new Array([ _currValue ])).concat(aResult);	
					}
					else
					{
						aResult.unshift(aTemp);	
					}
					
				}
				else
				{
					aResult.unshift(aTemp);	
				}					
				
				aTarget.splice(0, n);
				
				_prevValue = _currValue;					
			
			}
			
			return rebuildText(aResult);
		}
		
		
	}
	
}
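
The letter-order inversion described before the listing can be tried outside Flash with a few lines of code. This is only a minimal sketch, written in TypeScript rather than ActionScript so it runs without Flash Player, and `reverseLetters` is a hypothetical name of mine, not part of the Decoder class above.

```typescript
// Hypothetical helper, not part of arabicDecoderAS3: reverse a string
// letter by letter, which is the core idea behind the decoder.
function reverseLetters(text: string): string {
  // split("") works on UTF-16 code units, which is enough for Latin and
  // Arabic letters because both live in the Basic Multilingual Plane.
  return text.split("").reverse().join("");
}

console.log(reverseLetters("how are you")); // prints "uoy era woh"
```

Applying the function twice gives back the original string, which is why reversing a sentence that Flash shows backwards brings it back to the right order.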

The other significant elements are the ASCIIChecker and the English reversing-reversal. Tahoma, for example, has a complete set of glyphs: the Latin glyphs are addressed at decimal character codes 32-255, while the Arabic glyphs are addressed at their Unicode code points. With a simple ASCIIChecker we can therefore separate the Latin characters from the Arabic characters.
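
To make the range check concrete, here is a rough sketch of how such a checker could split a string into Latin and non-Latin runs. It is written in TypeScript so it can be run outside Flash. The names `isLatin`, `Run`, and `splitRuns` are my own assumptions for illustration and do not show the real ASCIIChecker source. The only range taken from the text is 32-255. Everything outside it is simply treated as non-Latin.

```typescript
// Range taken from the paragraph above: Latin glyphs at char codes 32-255.
const LATIN_MIN = 32;
const LATIN_MAX = 255;

// Hypothetical stand-in for ASCIIChecker.isLatin.
function isLatin(code: number): boolean {
  return code >= LATIN_MIN && code <= LATIN_MAX;
}

// A run is a stretch of consecutive characters of the same kind.
interface Run {
  latin: boolean;
  text: string;
}

// Walk the string once and group neighbouring characters by kind.
function splitRuns(text: string): Run[] {
  const runs: Run[] = [];
  for (let i = 0; i < text.length; i++) {
    const latin = isLatin(text.charCodeAt(i));
    const last = runs[runs.length - 1];
    if (last !== undefined && last.latin === latin) {
      last.text += text.charAt(i); // same kind: extend the current run
    } else {
      runs.push({ latin: latin, text: text.charAt(i) }); // kind changed: new run
    }
  }
  return runs;
}

// "\u0645\u0631\u062D\u0628\u0627" is an Arabic word in the U+0600-U+06FF block.
const runs = splitRuns("\u0645\u0631\u062D\u0628\u0627 Flash");
console.log(runs.length); // prints 2
```

Note that the space (char code 32) falls inside the Latin range, so in this sketch it ends up in the Latin run. That is the same awkwardness the SPACE hack in the listing has to deal with.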

And the last element is the English reversing-reversal (it’s a funny name, I know), which simply reverses the Latin words back into their correct order.
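
The reversing-reversal can be sketched like this, again in TypeScript so it runs outside Flash. It is a simplified illustration under my own assumptions, not the logic of `getCode` above: it reverses the whole string first, then flips each run of Latin words back. The names `reverseLetters`, `restoreLatin`, and `decode` are hypothetical, and the pattern only covers ASCII letters and digits separated by single spaces, which is narrower than the 32-255 range.

```typescript
// Reverse a string letter by letter (UTF-16 code units).
function reverseLetters(text: string): string {
  return text.split("").reverse().join("");
}

// Find every run of Latin words (ASCII letters and digits, joined by
// single spaces) and reverse it again so it reads left to right.
function restoreLatin(reversed: string): string {
  return reversed.replace(
    /[A-Za-z0-9]+(?: [A-Za-z0-9]+)*/g,
    (run: string) => reverseLetters(run)
  );
}

// Step 1 reverses everything, step 2 repairs the Latin parts.
function decode(text: string): string {
  return restoreLatin(reverseLetters(text));
}

console.log(restoreLatin("uoy era woh")); // prints "how are you"

// Arabic word followed by two Latin words: the Arabic letters end up
// reversed while "Adobe Flash" stays readable.
console.log(decode("\u0645\u0631\u062D\u0628\u0627 Adobe Flash"));
```

Treating several Latin words as one run keeps their word order intact. Flipping each word on its own would leave “Adobe Flash” as “Flash Adobe”.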

But again, without embedding, the ultimate solution cannot be achieved, and due to my lack of knowledge of Arabic I’ve reached a cul-de-sac with Arabic Flash. Embedding Arabic is entirely possible, but the problem is that some Arabic fonts are encoded differently from Unicode fonts. One example is the AXT font: dissecting its character codes shows that it does not use Unicode code points for its Arabic glyphs, which is why Windows fails to treat text set in it as bidirectional.

I really want to make this Arabic parser an open source project. If anybody can help me perfect it, at least until Adobe unleashes a real parser, hop in and join my bandwagon.
