I'm trying to get the contents of an HTML file (I receive it as NSData) and extract some information from it. My approach was to convert it to a C string like this:
Then I parse that string to get what I need. It works fine with Latin characters, but when the page contains Cyrillic or Chinese characters, for example, they aren't recognized and come out as gibberish. What can be done?
Will converting my NSData to UTF-8 help? How can that be done when the HTML file contains characters like " ?
Well, I tried NSASCIIStringEncoding, but it still doesn't work with non-English letters. Have you tried loading HTML with Russian text, for instance?
UTF-8 is even worse: it returns nil or an empty string. I think this is because the HTML file contains bytes that are not valid UTF-8.
Ok if no one knows, let me ask something else:
How would you retrieve data from web pages without string parsing? Are there any libraries that deal with that?
There's the available XML parser (see the SeismicXML example from Apple), but for regular HTML, you'll have to write the parsing bits yourself I'm afraid...
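For simple extractions, "writing the parsing bits yourself" can be done with NSScanner on an already decoded NSString rather than on raw bytes. The following is only a minimal sketch: TextBetween is a hypothetical helper name, it assumes the tags appear literally in the page (no attributes, no nesting), and it is written for ARC (under manual reference counting it works as is, since it only uses autoreleased objects).

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the text between the first occurrence of
// startTag and the next endTag, or nil if either tag is missing.
static NSString *TextBetween(NSString *html, NSString *startTag, NSString *endTag) {
    NSScanner *scanner = [NSScanner scannerWithString:html];
    [scanner setCharactersToBeSkipped:nil];  // keep whitespace exactly as it is
    [scanner setCaseSensitive:NO];           // match <TITLE> as well as <title>

    // Skip everything before the start tag, then step over the tag itself.
    [scanner scanUpToString:startTag intoString:NULL];
    if (![scanner scanString:startTag intoString:NULL]) {
        return nil;
    }

    // Collect everything up to the end tag (stays nil if there is nothing).
    NSString *found = nil;
    [scanner scanUpToString:endTag intoString:&found];
    if (![scanner scanString:endTag intoString:NULL]) {
        return nil;
    }
    return found != nil ? found : @"";
}
```

Because the scanner works on characters, not bytes, it behaves the same for Latin, Cyrillic, or Chinese text once the string has been decoded with the right encoding.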
Where in your code does it specify that the data is fetched asynchronously? And is that synonymous with "in another thread"?
/Steve
That method of downloading data is by definition asynchronous. You start a connection and handle each little piece of data as it streams in (the loading runs in the background, though the callbacks themselves are usually delivered on the thread that started the connection). The alternative to these three methods would be something like this (I'm not in front of my Mac right now, so this is not tested):
Code:
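- (void)FetchHTMLCode:(NSString *)url {
    // Blocks the calling thread until the whole download has finished.
    NSData *data = [NSData dataWithContentsOfURL:[NSURL URLWithString:url]];
    // ASCII only covers plain English text; pick the page's real encoding here.
    NSString *result = [[NSString alloc] initWithData:data encoding:NSASCIIStringEncoding];
    // Do something with result here
    [result release];
}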
In this case, when you call [NSData dataWithContentsOfURL:...], the download takes place synchronously: the call does not return, and result cannot be created, until the download has finished.
Unfortunately, I'm not sure how to handle international characters; I haven't ventured into that area yet. Check out this link, which lists the various NSStringEncoding types. You might need to experiment a bit.
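One way to "experiment a bit" is to try several encodings in order and keep the first one that decodes. This is only a sketch under assumptions: DecodeHTMLData is a hypothetical helper, and the candidate list (UTF-8, then Windows-1251 for Cyrillic, then Latin-1) is a guess that should be adapted to the pages actually being fetched. It is written for ARC; under manual reference counting, as in the snippet above, the returned string would need an autorelease.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the data decoded with the first candidate
// encoding that succeeds, or nil if none of them does.
static NSString *DecodeHTMLData(NSData *data) {
    // Assumed candidate list; order matters.
    const NSStringEncoding candidates[] = {
        NSUTF8StringEncoding,           // strict: fails on invalid byte sequences
        NSWindowsCP1251StringEncoding,  // common legacy encoding for Cyrillic pages
        NSISOLatin1StringEncoding       // last resort: accepts essentially any bytes
    };
    const NSUInteger count = sizeof(candidates) / sizeof(candidates[0]);

    for (NSUInteger i = 0; i < count; i++) {
        NSString *decoded = [[NSString alloc] initWithData:data
                                                  encoding:candidates[i]];
        if (decoded != nil) {
            return decoded;
        }
    }
    return nil;
}
```

UTF-8 goes first because it is the only strict candidate here: single-byte encodings such as Windows-1251 accept almost any input, so a successful decode with them does not prove the text is right. If the server sends a charset in its Content-Type header, or the page declares one in a meta tag, that is a more reliable source than guessing.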
Well by now I've tried various things but to no avail.
What I did was basically to convert the NSData to a byte array, parse it to get the relevant info, and put that in a new byte array. Then I tried to convert the new byte array to a UTF-8 string, and I got nil.
Next I tried to convert the new byte array back to NSData and convert that NSData to an NSString. Again I got nil.
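A possible reason for the nil results is that cutting a byte array can split a multi-byte character in half, and a strict decoder such as UTF-8 then rejects the whole fragment (it will also return nil if the page is simply not UTF-8 to begin with). A safer order is to decode the full buffer first and cut by character range afterwards. The sketch below assumes the encoding is already known; DecodedSubstring is a hypothetical helper, written for ARC.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: decodes the whole buffer, then cuts out a character
// range. Returns nil if the data is not valid in the given encoding or the
// range lies outside the decoded string.
static NSString *DecodedSubstring(NSData *data,
                                  NSStringEncoding encoding,
                                  NSRange range) {
    NSString *whole = [[NSString alloc] initWithData:data encoding:encoding];
    if (whole == nil || NSMaxRange(range) > [whole length]) {
        return nil;
    }
    // Widen the range if needed so it never splits a composed character
    // sequence or a surrogate pair.
    NSRange safe = [whole rangeOfComposedCharacterSequencesForRange:range];
    return [whole substringWithRange:safe];
}
```

For example, the first three letters of the Russian word for "hello" take six bytes in UTF-8. Taking the first three of those bytes leaves one and a half characters, which cannot be decoded, whereas asking for characters 1 to 2 of the decoded string is always well defined.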