Get readable XML data with Greek characters (UTF-8, HTML numeric character references)

This content is 11 years old. Please read this page with its age in mind.

What is your situation?

OK, let me tell you mine, and you will probably find similarities.

I get input data (strings) from a web service of a Greek public organization. The web service returns an XML document.

The XML can come in one of two formats:

 

1. The XML document is in UTF-8 encoding

If the response is in this format, it’s fairly easy to handle.

Let's say that the response is in:

id responseObject, which comes from

- (void)getPath:(NSString *)path parameters:(NSDictionary *)parameters success:(void (^)(AFHTTPRequestOperation *operation, id responseObject))success failure:(void (^)(AFHTTPRequestOperation *operation, NSError *error))failure;

if you use AFNetworking.

To get the string as readable Greek text, we use:

NSString *responseString = [[NSString alloc] initWithData:responseObject encoding:NSUTF8StringEncoding];

That’s it!
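One thing worth guarding against: initWithData:encoding: returns nil when the bytes are not valid UTF-8. Below is a minimal sketch of a defensive wrapper. StringFromUTF8Response is a hypothetical helper name of mine, not part of AFNetworking, and it assumes that responseObject holds the raw NSData of the response.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of AFNetworking): turns the raw response
// bytes into an NSString, or returns nil if that is not possible.
static NSString *StringFromUTF8Response(id responseObject) {
    // Assumption: the success block hands us raw NSData. Anything else
    // (e.g. an already parsed object) is rejected here.
    if (![responseObject isKindOfClass:[NSData class]]) {
        return nil;
    }
    // Returns nil when the bytes are not valid UTF-8.
    return [[NSString alloc] initWithData:(NSData *)responseObject
                                 encoding:NSUTF8StringEncoding];
}
```

Inside the success block you would call NSString *responseString = StringFromUTF8Response(responseObject); and treat a nil result as "the service did not send UTF-8".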

 

2. The XML document uses HTML numeric character references of the format

&#<number>; e.g. &#x3C2; for ς (see List of XML and HTML character entity references on Wikipedia)

If you are in this situation, you have to ask for help.

Michael Waterfall has developed the MWFeedParser framework (it is available on GitHub). MWFeedParser is an Objective-C framework for downloading and parsing RSS (1.* and 2.*) and Atom web feeds. This framework includes the stringByDecodingHTMLEntities method, which works flawlessly.

The only downside is that Xcode throws ARC errors and warnings on build. The good news is that the MWFeedParser framework has been forked by rmchaara, who has updated it with ARC support! The fork is also available on GitHub.

So, if you get the XML response inside id responseObject, convert it to responseString as in case 1 and then

just use

NSString *correctResponseString = [responseString stringByDecodingHTMLEntities];

provided that you have installed the framework and added

#import "NSString+HTML.h"

at the top of your .m file.
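If you are curious what the decoding involves for the numeric references, here is a rough Foundation-only sketch. It is not MWFeedParser's implementation, DecodeNumericCharacterReferences is a hypothetical name of mine, and it handles only decimal (&#960;) and hexadecimal (&#x3C0;) references. Named entities such as &amp; are left alone, so for real data I would still use the framework.

```objc
#import <Foundation/Foundation.h>
#include <stdlib.h>

// Hypothetical helper, NOT MWFeedParser's code: replaces numeric character
// references with the characters they name. Named entities are untouched.
static NSString *DecodeNumericCharacterReferences(NSString *input) {
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"&#([xX][0-9A-Fa-f]+|[0-9]+);"
                                                  options:0
                                                    error:NULL];
    NSMutableString *output = [input mutableCopy];
    NSArray *matches = [regex matchesInString:input
                                      options:0
                                        range:NSMakeRange(0, input.length)];

    // Walk the matches backwards so the earlier ranges stay valid while replacing.
    for (NSTextCheckingResult *match in [matches reverseObjectEnumerator]) {
        NSString *token = [input substringWithRange:[match rangeAtIndex:1]];
        BOOL isHex = [token hasPrefix:@"x"] || [token hasPrefix:@"X"];
        NSString *digits = isHex ? [token substringFromIndex:1] : token;
        unsigned long long value = strtoull(digits.UTF8String, NULL, isHex ? 16 : 10);

        // Leave references alone if they are not valid Unicode scalar values.
        if (value == 0 || value > 0x10FFFF || (value >= 0xD800 && value <= 0xDFFF)) {
            continue;
        }
        // Build a one-character string from the code point via UTF-32.
        uint32_t codePoint = CFSwapInt32HostToLittle((uint32_t)value);
        NSString *character = [[NSString alloc] initWithBytes:&codePoint
                                                       length:sizeof(codePoint)
                                                     encoding:NSUTF32LittleEndianStringEncoding];
        if (character != nil) {
            [output replaceCharactersInRange:match.range withString:character];
        }
    }
    return output;
}
```

For example, DecodeNumericCharacterReferences(@"&#x3C2;") gives ς, the same character used in the example above.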