A while ago I uploaded a list of games to Neocities. Recently I discovered a template system called Mustache, so I decided to put all the data into a JSON file and let the C implementation of Mustache generate the Web page for me. It produced a visually identical version from this template:
$ cat list-of-vehicle-building-games.mustache
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<link rel="icon" type="image/svg+xml" href="./logo.svg">
<title>{{title}} - aqueduct1</title>
</head>
<body>
<h1>{{title}}</h1>
<ul>{{#games}}
<li>
{{name}}
{{#resources}}[<a href="{{url}}">{{name}}{{#index}}<sup>{{.}}</sup>{{/index}}</a>{{#index}}<sup>{{/index}}{{#archive}} (archive){{/archive}}{{#mention}} (mention){{/mention}}{{#index}}</sup>{{/index}}{{#alts}}
<sup><a href="{{url}}">{{index}}</a>{{#archive}} (archive){{/archive}}{{#mention}} (mention){{/mention}}</sup>{{/alts}}]
{{/resources}}
{{{notes}}}
</li>
{{/games}}</ul>
[...]
</body>
</html>
But to get the JSON... I suffered. This cursed Bash script, using html-xml-utils, grep and sed, spat out broken JSON. It saved me a lot of time, but the output still needed a lot of manual corrections before jq stopped complaining:
$ cat extract-games-to-json
#!/bin/bash
shopt -s extglob
json_games=()
IFS=$'\n' resources=($(cat list-of-vehicle-building-games.html | hxremove sup | hxselect -s "\n" -c ul li a | sort | uniq))
IFS=$'\136' games=($(cat list-of-vehicle-building-games.html | hxselect -c -s $'\136' li ))
echo -en "{\n"
echo -en "\t\"title\" : \"$(cat list-of-vehicle-building-games.html | hxselect -c h1)\",\n"
echo -en '\t"games" : ['
for game_i in "${!games[@]}"; do
IFS=$'\n' game=($(echo "${games[game_i]}"))
unset "game[${#game[@]} - 1]"
name="${game[0]##*([[:space:]])}"
notes="${game[${#game[@]} - 1]##*([[:space:]])}"
unset "game[0]"
json_games[game_i]="\n\t\t{\n\t\t\t\"name\" : \"$name\""
if [[ "$notes" =~ ^\(.*\)$ ]]; then
json_games[game_i]+=",\n\t\t\t\"notes\" : \"$(echo "$notes" | sed 's/"/\\"/g')\""
unset "game[${#game[@]} - 1]"
fi
if [[ ${#game[@]} -gt 0 ]]; then
json_games[game_i]+=",\n\t\t\t\"resources\" : ["
resources=()
for resource in "${game[@]}"; do
if [[ "$resource" =~ [.*] ]]; then
resources+=" {\"url\" : $(echo "$resource" | grep -o '".*"'), \"name\" : \"$(echo "$resource" | grep -o '>.*<')\"}"
fi
done
json_games[game_i]+=$(IFS=, echo -n "${resources[*]}")
json_games[game_i]+="\n\t\t\t]"
fi
json_games[game_i]+="\n\t\t}"
done
(IFS=,; echo -en "${json_games[*]}")
echo -en '\n\t]\n}'
At this point I contemplated Python... I think I'll give it a shot. Sample of the hairiest parts of the JSON:
$ jq '.games[] | select(.name == "Fraxy")' list-of-vehicle-building-games.json
{
"name": "Fraxy",
"resources": [
{
"url": "https://web.archive.org/web/20200129130946/http://monz.sp.land.to/wp/fraxy/",
"name": "site",
"archive": true,
"index": 1,
"alts": [
{
"index": 2,
"url": "http://fraxyhq.net/",
"archive": false
}
]
},
{
"url": "https://shmup.fandom.com/wiki/Fraxy",
"name": "wiki",
"index": 1,
"alts": [
{
"index": 2,
"url": "https://tig.fandom.com/wiki/Fraxy"
},
{
"index": 3,
"url": "https://web.archive.org/web/20100310233900/http://fraxy.kafuka.org/wiki/Main_Page",
"archive": true
},
{
"index": 4,
"url": "https://web.archive.org/web/20141218150554/http://wiki.fraxy.net/index.php?title=Main_Page",
"archive": true
},
{
"index": 5,
"url": "https://web.archive.org/web/20090317233751/http://fraxycompendium.pbwiki.com/",
"archive": true
},
{
"index": 6,
"url": "http://fraxyacademy.pbworks.com/w/page/8284158/FrontPage"
},
{
"index": 7,
"url": "http://fraxy.pbworks.com/w/page/8284047/The%20Bosses"
}
]
},
{
"url": "http://fraxyhq.net/forums/index.php",
"name": "forum",
"index": 1,
"alts": [
{
"index": 2,
"url": "https://web.archive.org/web/20180911174817/http://acmlm.kafuka.org/board/forum.php?id=51",
"archive": true
},
{
"index": 3,
"url": "https://web.archive.org/web/20120313164954/http://fraxy.forumi.biz/",
"archive": true
},
{
"index": 4,
"url": "https://web.archive.org/web/20121107142027/http://fraxyhq.com:80/forums/index.php",
"archive": true
}
]
},
{
"url": "https://www.youtube.com/results?search_query=%22fraxy%22",
"name": "YouTube"
},
{
"url": "https://tvtropes.org/pmwiki/pmwiki.php/VideoGame/Fraxy",
"name": "TV Tropes"
}
]
}
$ jq '.games[] | select(.name == "Block Tech Sandbox")' list-of-vehicle-building-games.json
{
"name": "Block Tech Sandbox",
"notes": "(aka Block Tech: Epic Sandbox, or Block Tech: Epic Sandbox Craft Simulator Online)",
"resources": [
{
"url": "https://play.google.com/store/apps/details?id=com.NGG.BlockTech",
"name": "Play Store",
"index": "free",
"alts": [
{
"index": "gold",
"url": "https://play.google.com/store/apps/details?id=com.NGG.BlockTechGold"
}
]
},
{
"url": "https://apps.apple.com/app/block-tech-sandbox-online/id1465592382",
"name": "App Store"
},
{
"url": "https://www.crazygames.com/game/block-tech-epic-sandbox",
"name": "Crazy Games"
},
{
"url": "https://www.silvergames.com/block-tech-epic-sandbox",
"name": "Silver Games"
}
]
}
Maybe I could've learned some of those Python libraries. Maybe I missed some program that could've been more useful. Maybe I need to learn more Bash. What do you think?
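For comparison, here is a minimal sketch of what the Python route might look like using only the standard library's html.parser. It assumes the page has the same shape the template produces (each <li> starts with the game name, main links sit outside <sup>, alternates sit inside <sup>). The names GameListParser and extract_games are made up for this example, and notes, indexes and alternates are deliberately left out, so it is not a full replacement for the script above.

```python
import json
from html.parser import HTMLParser


class GameListParser(HTMLParser):
    """Collect one dict per <li>: leading text as the name, top-level links as resources."""

    def __init__(self):
        super().__init__()
        self.games = []
        self.game = None    # dict for the <li> being read, None outside a list item
        self.link = None    # dict for the <a> being read, None outside a main link
        self.sup_depth = 0  # > 0 while inside <sup>, where indexes and alternates live

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self.game = {"name": "", "resources": []}
        elif tag == "sup":
            self.sup_depth += 1
        elif tag == "a" and self.game is not None and self.sup_depth == 0:
            self.link = {"url": dict(attrs).get("href", ""), "name": ""}

    def handle_endtag(self, tag):
        if tag == "li" and self.game is not None:
            # Drop surrounding whitespace and the "[" that opens the first link.
            self.game["name"] = self.game["name"].strip().rstrip("[").rstrip()
            self.games.append(self.game)
            self.game = None
            self.link = None
        elif tag == "sup":
            self.sup_depth = max(0, self.sup_depth - 1)
        elif tag == "a" and self.link is not None:
            self.game["resources"].append(self.link)
            self.link = None

    def handle_data(self, data):
        if self.game is None or self.sup_depth:
            return  # outside the list, or inside <sup> (skipped in this sketch)
        if self.link is not None:
            self.link["name"] += data
        elif not self.game["resources"]:
            # Text before the first link is taken to be the game's name.
            self.game["name"] += data


def extract_games(html_text):
    parser = GameListParser()
    parser.feed(html_text)
    parser.close()
    return parser.games


# Markup written by hand to mimic the template's output for one game.
sample = """<ul><li>
Fraxy
[<a href="https://web.archive.org/web/20200129130946/http://monz.sp.land.to/wp/fraxy/">site<sup>1</sup></a><sup> (archive)</sup>
<sup><a href="http://fraxyhq.net/">2</a></sup>]
[<a href="https://tvtropes.org/pmwiki/pmwiki.php/VideoGame/Fraxy">TV Tropes</a>]
</li></ul>"""

print(json.dumps({"games": extract_games(sample)}, indent=2))
```

Since json.dumps does the quoting and the commas, the output is at least always valid JSON, which is the part the Bash version struggled with.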
Behemoth, wake up!
I would have entered this data in a spreadsheet, exported it as CSV or TSV and then created the HTML with Python. You are making it unnecessarily difficult by storing data in JSON that is basically just a big table.
And I don't think Bash is ever the correct answer to any programming problem.
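The spreadsheet route suggested here could look roughly like the sketch below. It assumes a TSV export with one row per link and three columns named game, resource and url. Those column names, the function name render_list and the page title are invented for the example. Alternates and notes would need extra columns and are not handled.

```python
import csv
import html
import io


def render_list(tsv_text, title):
    """Render tab-separated rows (game, resource, url) as an HTML list, one <li> per game."""
    games = {}  # game name -> list of (resource, url); dicts keep insertion order
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        games.setdefault(row["game"], []).append((row["resource"], row["url"]))

    lines = [f"<h1>{html.escape(title)}</h1>", "<ul>"]
    for name, links in games.items():
        anchors = " ".join(
            f'[<a href="{html.escape(url, quote=True)}">{html.escape(label)}</a>]'
            for label, url in links
        )
        lines.append(f"<li>{html.escape(name)} {anchors}</li>")
    lines.append("</ul>")
    return "\n".join(lines)


# A few rows taken from the JSON sample in this thread, flattened into a table.
sample_tsv = (
    "game\tresource\turl\n"
    "Fraxy\twiki\thttps://shmup.fandom.com/wiki/Fraxy\n"
    "Fraxy\tTV Tropes\thttps://tvtropes.org/pmwiki/pmwiki.php/VideoGame/Fraxy\n"
    "Block Tech Sandbox\tCrazy Games\thttps://www.crazygames.com/game/block-tech-epic-sandbox\n"
)

# The title here is a placeholder, not the real page title.
print(render_list(sample_tsv, "List of games"))
```

The trade-off is that a flat table handles "game, link name, URL" well, while the nested alternates in the Fraxy entry would need either extra rows with an index column or a second sheet.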
Offline
Out of habit, I like Python for handling JSON, since JSON's concepts map nicely onto Python's own data types, at least in my experience.
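To illustrate that mapping, here is a small sketch using the standard json module on an abridged copy of the Fraxy entry from the first post. JSON objects become dicts, arrays become lists, and true becomes True. The helper names find_game and archived_urls are made up for this example.

```python
import json

# Abridged from the jq output earlier in the thread.
doc = json.loads("""
{
  "games": [
    {
      "name": "Fraxy",
      "resources": [
        {
          "url": "http://fraxyhq.net/forums/index.php",
          "name": "forum",
          "index": 1,
          "alts": [
            {
              "index": 2,
              "url": "https://web.archive.org/web/20180911174817/http://acmlm.kafuka.org/board/forum.php?id=51",
              "archive": true
            }
          ]
        },
        {
          "url": "https://tvtropes.org/pmwiki/pmwiki.php/VideoGame/Fraxy",
          "name": "TV Tropes"
        }
      ]
    }
  ]
}
""")


def find_game(doc, name):
    """Rough equivalent of: jq '.games[] | select(.name == NAME)'"""
    return [game for game in doc["games"] if game["name"] == name]


def archived_urls(game):
    """Every URL flagged "archive": true, whether a main link or an alternate."""
    urls = []
    for resource in game.get("resources", []):
        for entry in [resource, *resource.get("alts", [])]:
            if entry.get("archive"):
                urls.append(entry["url"])
    return urls


fraxy = find_game(doc, "Fraxy")[0]
print([resource["name"] for resource in fraxy["resources"]])
print(archived_urls(fraxy))
```

Optional keys such as "alts" and "archive" are read with .get(), so entries that omit them need no special casing.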