#1 2015-05-30 11:46:50

maggie
Member
Registered: 2011-02-12
Posts: 255

Need to parse xml or json file but am not a programmer

My VPN provider has an API I can use to download the current server list in XML or JSON format. I want to parse it and have it show me the top 3 servers based on current load, but after googling "linux xml parser tutorial" I found that the format of my XML file is very different from what the tutorials expect. Here is a sample:

<?xml version='1.0' standalone='yes'?>
<status result="ok">
  <servers>
    <servers public_name="Acamar" country_name="United States" country_code="us" location="Miami" continent="America" bw="47" bw_max="1000" users="57" currentload="4" />
    <servers public_name="Acrux" country_name="Netherlands" country_code="nl" location="Amsterdam" continent="Europe" bw="25" bw_max="1000" users="20" currentload="2" />
    <servers public_name="Acubens" country_name="Sweden" country_code="se" location="Uppsala" continent="Europe" bw="121" bw_max="1000" users="64" currentload="12" />
    <servers public_name="Alhena" country_name="Canada" country_code="ca" location="Toronto, Ontario" continent="America" bw="269" bw_max="1000" users="84" currentload="26" />
    <servers public_name="Alkaid" country_name="United States" country_code="us" location="Chicago, Illinois" continent="America" bw="210" bw_max="1000" users="88" currentload="21" />
  </servers>
</status>

The tutorials I found expect things like "public_name" and "bw" to be in their own elements, like <public_name>Acamar</public_name>. Can someone recommend a good tool, one that doesn't require me to know perl or python, that I can use to pull out, for example, all servers in Canada with a bw_max of 1000?

Last edited by maggie (2015-05-30 11:49:45)

#2 2015-05-30 12:14:06

Awebb
Member
Registered: 2010-05-06
Posts: 6,285

Re: Need to parse xml or json file but am not a programmer

Or JSON?

http://kmkeen.com/jshon/

EDIT: If you have access to JSON data, use JSON. It was made because XML is as human-readable as ancient Phoenician with bad spelling.

Last edited by Awebb (2015-05-30 12:16:50)

#3 2015-05-30 12:48:03

Trilby
Inspector Parrot
Registered: 2011-11-29
Posts: 29,521

Re: Need to parse xml or json file but am not a programmer

awk '/bw_max="1000"/ && /country_name="Canada"/ { print $2; }' your_file.xml

"UNIX is simple and coherent..." - Dennis Ritchie, "GNU's Not UNIX" -  Richard Stallman

#4 2015-05-30 12:50:27

maggie
Member
Registered: 2011-02-12
Posts: 255

Re: Need to parse xml or json file but am not a programmer

Yes, I can get JSON, but the format looks different than I expected.
http://pastebin.com/gcQrcNAW

#5 2015-05-30 12:53:57

maggie
Member
Registered: 2011-02-12
Posts: 255

Re: Need to parse xml or json file but am not a programmer

Thank you Trilby. Can awk sort them by a third value, "bw", or is it better to use /usr/bin/sort with more output? The bw value is the load and lower is better.

#6 2015-05-30 13:07:10

Awebb
Member
Registered: 2010-05-06
Posts: 6,285

Re: Need to parse xml or json file but am not a programmer

Looks like plain old JSON to me. Have you tried JSHON?

#7 2015-05-30 13:14:50

karol
Archivist
Registered: 2009-05-06
Posts: 25,440

Re: Need to parse xml or json file but am not a programmer

As the XML has spaces in the attribute values, e.g. 'Toronto, Ontario', using awk like this won't work in every situation:

$ awk '/bw_max="1000"/ && /country_name="Canada"/ { print $2,$8; }' test.xml | awk -F '"' '{ print $2,$4; }'
Alhena 269
$ awk '/bw_max="1000"/ { print $2,$8; }' test.xml | awk -F '"' '{ print $2,$4; }' | sort -nk 2
Alkaid America
Acamar 47
Alhena 269
Acrux 1000
Acubens 1000

but it will work if every one of your Canadian servers has its location in the "foo, bar" format:

$ awk '/bw_max="1000"/ && /country_name="Canada"/ { print $2,$8; }' test.xml | awk -F '"' '{ print $2,$4; }' | sort -nk 2
foo1 26
foo3 29
foo4 69
Alhena 269
foo2 269
foo6 629
foo5 9001
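
One way around the field-position problem is to look attributes up by name instead of by column. What follows is only a sketch under some assumptions: every server sits on one line as in the sample from post #1, attribute values never contain a double quote, and the helper names attr and top_servers are made up for this example. It prints the three least-loaded servers by currentload, optionally restricted to one country_code.

```shell
#!/bin/sh
# top_servers FILE [COUNTRY_CODE]
# Print "currentload public_name" for the three least-loaded servers.
# Hypothetical helper for illustration; assumes one <servers .../> element per line.
top_servers() {
    awk -v cc="${2-}" '
        # attr(line, name): value of name="value" in line, or "" if absent.
        function attr(line, name,    pat, s) {
            pat = " " name "=\"[^\"]*\""
            if (match(line, pat) == 0) return ""
            s = substr(line, RSTART, RLENGTH)
            sub(/^[^"]*"/, "", s)   # strip everything through the opening quote
            sub(/"$/, "", s)        # strip the closing quote
            return s
        }
        /<servers / {
            # Skip servers from other countries when a country code was given.
            if (cc != "" && attr($0, "country_code") != cc) next
            print attr($0, "currentload"), attr($0, "public_name")
        }
    ' "$1" | sort -n | head -n 3
}

# Sample data copied from post #1.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<?xml version='1.0' standalone='yes'?>
<status result="ok">
  <servers>
    <servers public_name="Acamar" country_name="United States" country_code="us" location="Miami" continent="America" bw="47" bw_max="1000" users="57" currentload="4" />
    <servers public_name="Acrux" country_name="Netherlands" country_code="nl" location="Amsterdam" continent="Europe" bw="25" bw_max="1000" users="20" currentload="2" />
    <servers public_name="Acubens" country_name="Sweden" country_code="se" location="Uppsala" continent="Europe" bw="121" bw_max="1000" users="64" currentload="12" />
    <servers public_name="Alhena" country_name="Canada" country_code="ca" location="Toronto, Ontario" continent="America" bw="269" bw_max="1000" users="84" currentload="26" />
    <servers public_name="Alkaid" country_name="United States" country_code="us" location="Chicago, Illinois" continent="America" bw="210" bw_max="1000" users="88" currentload="21" />
  </servers>
</status>
EOF

top_servers "$sample"       # three least-loaded servers overall
top_servers "$sample" us    # same, United States only
```

On the sample data the first call lists Acrux (2), Acamar (4) and Acubens (12); the second lists Acamar (4) and Alkaid (21). Since the lookup is by attribute name, spaces inside values such as "Toronto, Ontario" no longer shift the columns, but this is still plain text matching and not a real XML parser.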

#8 2015-05-31 00:22:29

bulletmark
Member
From: Brisbane, Australia
Registered: 2013-10-22
Posts: 652

Re: Need to parse xml or json file but am not a programmer

Install xmlstarlet from the standard packages and then use something like:

xmlstarlet sel -t -m //servers/servers -v @bw -o " " -v @public_name  -n file.xml  | sort -n

An advantage of this approach is that it accepts arbitrarily formatted XML input (so long as it is valid, of course).

#9 2015-05-31 05:02:45

maggie
Member
Registered: 2011-02-12
Posts: 255

Re: Need to parse xml or json file but am not a programmer

Hooray. Thank you everyone. Karol's code is working.

@bulletmark - Thank you for that usage of xmlstarlet. I cannot figure out how to add matching on country_code="ca" to your example, though.

#10 2015-05-31 05:14:17

bulletmark
Member
From: Brisbane, Australia
Registered: 2013-10-22
Posts: 652

Re: Need to parse xml or json file but am not a programmer

@maggie, I could tell you, but the xmlstarlet documentation, and the articles it links to, are pretty darn good.

#11 2015-05-31 05:30:27

progandy
Member
Registered: 2012-05-17
Posts: 5,190

Re: Need to parse xml or json file but am not a programmer

@maggie: That is easily done with XPath expressions, for example:

xmlstarlet sel -t -m '//servers[@country_code="us"]' -c '.' -n vpn.xml
xmlstarlet sel -T -t -m '//servers[@country_code="us"]' -v @public_name -o " " -v @bw -n vpn.xml

xmlstarlet sel -t -m '//servers[@country_code="us"]' -v '@public_name' -o ": " -v '@currentload' -n vpn.xml
xmlstarlet sel -t -m '//servers[@country_code="us" and @currentload<15]' -v '@public_name' -o ": " -v '@currentload' -n vpn.xml
xmlstarlet sel -t -m '//servers[@currentload<15]' -v '@public_name' -o ": " -v '@currentload' -n vpn.xml

http://www.w3schools.com/xpath/default.asp

Edit: If you use JSON and jshon, then you'll have to do the sorting and filtering in the shell:

jshon -e servers -a -e country_code -u -p -e public_name -u -p -e bw -u <vpn.json | {
    # filter country_code
    while read -r country_code && read -r public_name && read -r bw; do
        [ x"$country_code" = x"ca" ] && echo "$public_name $bw";
    done
} | sort -k2,2nr -k1,1d # sort second field (bw) numeric reverse, if equal use dictionary sort on first (name)
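
To see how the three chained read calls group the lines, here is a small stand-alone sketch that does not need jshon. It assumes the jshon pipeline above emits one value per line, three lines per server (country_code, public_name, bw); the function names fake_jshon_output and filter_country are invented for this example, and the values are taken from the XML sample in post #1.

```shell
#!/bin/sh
# Stand-in for the jshon output: one value per line, three lines per server.
# (Assumed layout; values copied from the sample in post #1.)
fake_jshon_output() {
    printf '%s\n' us Acamar 47  nl Acrux 25  ca Alhena 269  us Alkaid 210
}

# filter_country CODE: read records of three lines, keep those matching CODE.
filter_country() {
    while read -r country_code && read -r public_name && read -r bw; do
        [ "$country_code" = "$1" ] && echo "$public_name $bw"
    done
    return 0
}

# Sort by the second field (bw), numeric and descending, then by name.
fake_jshon_output | filter_country us | sort -k2,2nr -k1,1d
```

With this input the pipeline prints "Alkaid 210" followed by "Acamar 47". Dropping the r from -k2,2nr would put the lowest bw first instead.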

Last edited by progandy (2015-05-31 05:47:50)


| alias CUTF='LANG=en_XX.UTF-8@POSIX ' |

#12 2015-05-31 05:54:25

maggie
Member
Registered: 2011-02-12
Posts: 255

Re: Need to parse xml or json file but am not a programmer

Thank you progandy. I needed the -T switch:

xml sel -T -t -m '//servers[@country_code="ca"]' -v @bw -o " " -v @public_name -n file.xml|sort -n

Why didn't my attempt using bulletmark's code work?

xml sel -t -m //servers/servers/[@country_code='ca'] -v @bw -o " " -v @public_name -n file.xml  
Invalid expression: //servers/servers/[@country_code=ca]
compilation error: element for-each
xsl:for-each : could not compile select expression '//servers/servers/[@country_code=ca]'

I tried it with single quotes and double quotes but it still fails.

Last edited by maggie (2015-05-31 05:55:24)

#13 2015-06-01 18:26:08

thiagowfx
Member
Registered: 2013-07-09
Posts: 586

Re: Need to parse xml or json file but am not a programmer

It seems like you should use this (note I removed one '/'):

//servers/servers[@country_code='ca']
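
Besides the extra '/', the error output in post #12 shows the expression arriving as [@country_code=ca], with the quotes gone. That suggests the shell removed them because the expression as a whole was not quoted. A small sketch to check this on your own machine (the variable names are made up here, and the xmlstarlet line only runs if the tool and a file.xml are present):

```shell
#!/bin/sh
# Unquoted: the shell strips the single quotes before the program sees the argument.
unquoted=$(printf '%s' //servers/servers[@country_code='ca'])

# Quoted as a whole: the inner single quotes survive.
quoted=$(printf '%s' "//servers/servers[@country_code='ca']")

printf 'unquoted: %s\nquoted:   %s\n' "$unquoted" "$quoted"

# With the expression quoted, the full command would look like this.
if command -v xmlstarlet >/dev/null 2>&1 && [ -f file.xml ]; then
    xmlstarlet sel -T -t -m "//servers/servers[@country_code='ca']" \
        -v @bw -o " " -v @public_name -n file.xml | sort -n
fi
```

The first printf argument loses its quotes, so XPath would get a bare ca (an element name, not a string) to compare against, while the second form passes the string literal 'ca' through intact.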
