My VPN provider has an API I can use to download the current server list in XML or JSON format. I want to parse it and have it show me the top 3 servers by current load, but after googling "linux xml parser tutorial" I found that the format of my XML file is very different from what the tutorials expect. Here is a sample:
<?xml version='1.0' standalone='yes'?>
<status result="ok">
<servers>
<servers public_name="Acamar" country_name="United States" country_code="us" location="Miami" continent="America" bw="47" bw_max="1000" users="57" currentload="4" />
<servers public_name="Acrux" country_name="Netherlands" country_code="nl" location="Amsterdam" continent="Europe" bw="25" bw_max="1000" users="20" currentload="2" />
<servers public_name="Acubens" country_name="Sweden" country_code="se" location="Uppsala" continent="Europe" bw="121" bw_max="1000" users="64" currentload="12" />
<servers public_name="Alhena" country_name="Canada" country_code="ca" location="Toronto, Ontario" continent="America" bw="269" bw_max="1000" users="84" currentload="26" />
<servers public_name="Alkaid" country_name="United States" country_code="us" location="Chicago, Illinois" continent="America" bw="210" bw_max="1000" users="88" currentload="21" />
</servers>
</status>
The tutorials I found expect things like "public_name" and "bw" to be in their own elements, like <public_name>Acamar</public_name>. Can someone recommend a good tool, one that doesn't require me to know Perl or Python, that I can use to, for example, pull out all servers in Canada with a bw_max of 1000?
Last edited by maggie (2015-05-30 11:49:45)
Offline
Or JSON?
EDIT: If you have access to JSON data, use JSON. It was made because XML is as human-readable as ancient Phoenician with bad spelling.
Last edited by Awebb (2015-05-30 12:16:50)
Offline
awk '/bw_max="1000"/ && /country_name="Canada"/ { print $2; }' your_file.xml
Offline
Yes, I can get JSON, but the format looks different from what I expected.
http://pastebin.com/gcQrcNAW
Offline
Thank you Trilby. Can awk sort them by a third value, "bw", or is it better to print more output and pipe it to /usr/bin/sort? The bw value is the load, and lower is better.
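A minimal sketch of the sort-based route, in case it helps: sort can split lines on a custom delimiter, so the matching lines can be ordered by bw directly. It assumes every server sits on its own line with the attributes in the same order as in the sample above, which makes bw the 12th field when splitting on double quotes. The `sample` variable and the `by_bw` function name are made up for illustration.

```shell
# Sample entries copied from the first post (XML declaration omitted).
sample='<status result="ok">
<servers>
<servers public_name="Acamar" country_name="United States" country_code="us" location="Miami" continent="America" bw="47" bw_max="1000" users="57" currentload="4" />
<servers public_name="Acrux" country_name="Netherlands" country_code="nl" location="Amsterdam" continent="Europe" bw="25" bw_max="1000" users="20" currentload="2" />
<servers public_name="Acubens" country_name="Sweden" country_code="se" location="Uppsala" continent="Europe" bw="121" bw_max="1000" users="64" currentload="12" />
<servers public_name="Alhena" country_name="Canada" country_code="ca" location="Toronto, Ontario" continent="America" bw="269" bw_max="1000" users="84" currentload="26" />
<servers public_name="Alkaid" country_name="United States" country_code="us" location="Chicago, Illinois" continent="America" bw="210" bw_max="1000" users="88" currentload="21" />
</servers>
</status>'

# Hypothetical helper: keep only the server entries from stdin, then sort
# them numerically on the 12th double-quote-delimited field (the bw value).
by_bw() {
    grep '<servers public_name=' | sort -t '"' -k 12,12n
}

# Print name and bw, lowest bw first.
printf '%s\n' "$sample" | by_bw | awk -F '"' '{ print $2, $12 }'
```

Adding a pattern such as `grep 'country_code="ca"'` in front of the sort would restrict the list to one country.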
Offline
Looks like plain old JSON to me. Have you tried jshon?
Offline
As the XML has spaces in some attribute values, e.g. 'Toronto, Ontario', splitting on whitespace with awk like this won't work in every situation:
$ awk '/bw_max="1000"/ && /country_name="Canada"/ { print $2,$8; }' test.xml | awk -F '"' '{ print $2,$4; }'
Alhena 269
$ awk '/bw_max="1000"/ { print $2,$8; }' test.xml | awk -F '"' '{ print $2,$4; }' | sort -nk 2
Alkaid America
Acamar 47
Alhena 269
Acrux 1000
Acubens 1000
but it will work if every one of your Canadian servers has its location in the "foo, bar" format:
$ awk '/bw_max="1000"/ && /country_name="Canada"/ { print $2,$8; }' test.xml | awk -F '"' '{ print $2,$4; }' | sort -nk 2
foo1 26
foo3 29
foo4 69
Alhena 269
foo2 269
foo6 629
foo5 9001
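A sketch of one way around the whitespace problem: split each line on double quotes instead of on spaces, so a value like "Toronto, Ontario" stays in a single field. This assumes one server per line and the attribute order shown in the sample (public_name is field 2, bw_max is field 14, currentload is field 18). The `sample` variable and the `top3` function name are invented for this example.

```shell
# Sample entries copied from the first post (XML declaration omitted).
sample='<status result="ok">
<servers>
<servers public_name="Acamar" country_name="United States" country_code="us" location="Miami" continent="America" bw="47" bw_max="1000" users="57" currentload="4" />
<servers public_name="Acrux" country_name="Netherlands" country_code="nl" location="Amsterdam" continent="Europe" bw="25" bw_max="1000" users="20" currentload="2" />
<servers public_name="Acubens" country_name="Sweden" country_code="se" location="Uppsala" continent="Europe" bw="121" bw_max="1000" users="64" currentload="12" />
<servers public_name="Alhena" country_name="Canada" country_code="ca" location="Toronto, Ontario" continent="America" bw="269" bw_max="1000" users="84" currentload="26" />
<servers public_name="Alkaid" country_name="United States" country_code="us" location="Chicago, Illinois" continent="America" bw="210" bw_max="1000" users="88" currentload="21" />
</servers>
</status>'

# Hypothetical helper: read the XML on stdin and print the three servers
# with the lowest currentload among those whose bw_max is 1000.
top3() {
    awk -F '"' '/<servers public_name=/ && $14 == 1000 { print $18, $2 }' |
        sort -n |
        head -n 3
}

printf '%s\n' "$sample" | top3
```

Printing the numeric value first keeps the sort key in column one no matter how many spaces the other fields contain.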
Offline
Install xmlstarlet from the standard packages and then use something like:
xmlstarlet sel -t -m //servers/servers -v @bw -o " " -v @public_name -n file.xml | sort -n
An advantage of this approach is that it will accept arbitrarily formatted XML input (so long as it is well-formed, of course).
Offline
Hooray. Thank you everyone. Karol's code is working.
@bulletmark - Thank you for that usage of xmlstarlet. I cannot figure out how to add matching on "country_code=ca" to your example, though.
Offline
@maggie, I could tell you, but the xmlstarlet documentation and the articles it links to are pretty darn good.
Offline
@maggie: That is easily done with XPath expressions, for example:
xmlstarlet sel -t -m '//servers[@country_code="us"]' -c '.' -n vpn.xml
xmlstarlet sel -T -t -m '//servers[@country_code="us"]' -v @public_name -o " " -v @bw -n vpn.xml
xmlstarlet sel -t -m '//servers[@country_code="us"]' -v '@public_name' -o ": " -v '@currentload' -n vpn.xml
xmlstarlet sel -t -m '//servers[@country_code="us" and @currentload<15]' -v '@public_name' -o ": " -v '@currentload' -n vpn.xml
xmlstarlet sel -t -m '//servers[@currentload<15]' -v '@public_name' -o ": " -v '@currentload' -n vpn.xml
http://www.w3schools.com/xpath/default.asp
Edit: If you use JSON and jshon, then you'll have to do the sorting and filtering in the shell:
jshon -e servers -a -e country_code -u -p -e public_name -u -p -e bw -u <vpn.json | {
# filter country_code
while read -r country_code && read -r public_name && read -r bw; do
[ x"$country_code" = x"ca" ] && echo "$public_name $bw";
done
} | sort -k2,2nr -k1,1d # sort second field (bw) numeric reverse, if equal use dictionary sort on first (name)
Last edited by progandy (2015-05-31 05:47:50)
Offline
Thank you progandy. I needed the -T switch:
xml sel -T -t -m '//servers[@country_code="ca"]' -v @bw -o " " -v @public_name -n file.xml|sort -n
Why didn't my attempt using bulletmark's code work?
xml sel -t -m //servers/servers/[@country_code='ca'] -v @bw -o " " -v @public_name -n file.xml
Invalid expression: //servers/servers/[@country_code=ca]
compilation error: element for-each
xsl:for-each : could not compile select expression '//servers/servers/[@country_code=ca]'
I tried it with single quotes and double quotes, but it still fails.
Last edited by maggie (2015-05-31 05:55:24)
Offline
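It seems like you should use this (note I removed one '/', since a predicate in square brackets attaches directly to the element name; also wrap the whole expression in double quotes so the shell keeps the inner single quotes):
"//servers/servers[@country_code='ca']"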
Offline