
#1 2020-05-29 12:34:33

CarterCox
Member
From: Argentina
Registered: 2018-02-24
Posts: 116

[SOLVED] Create a text tree from a text file

Hi everyone!

I am trying to do something and I find myself lacking some tools and knowledge.

I have a file that looks like this (I'll use the "List of applications" https://wiki.archlinux.org/index.php/Li … plications page from the wiki as an example):

[0, 'Contents', 'Contents', null]
[1, 'Internet', 'Internet', 3242]
[2, 'Network connection', 'Network_connection', 24234]
[3, 'Network managers', 'Network_managers', 21312]
[3, 'VPN clients', 'VPN_clients', 12321]
-----snip-----

And what I'm trying to do is obtain something that would look like this:

Contents
|-Internet
| |-Network connection
| | |-Network managers
| | |-VPN clients

The first problem I run into is using awk. I could do

$ awk '{print $1 $2}' Arch.txt 
[0,'Contents',
[1,'Internet',
[2.'Network
[3,'Network
[3,'VPN

but I'd lose some data, because awk splits on whitespace and the second and third words of some items would disappear. For example, "VPN clients" gets cut down to "VPN", and that won't work. How can I modify this awk command to get what I want?

Later I could use sed to remove all the symbols, but I can worry about that later.

The second problem is that I'm not sure how to turn the numbers into symbols that represent the different levels. For example, I could turn a 1 into "|-", a 2 into "| |-", and so on, and I think that would do just fine. However, I'm not sure what tool to use for this. Do you have any ideas?

I could also settle for turning the numbers into a fixed amount of, for example, dashes, so it would look like this:

Contents
-Internet
--Network connection
---Network managers
---VPN clients

Last edited by CarterCox (2020-05-29 13:16:16)


And neither the angels in Heaven above
   Nor the demons down under the sea
Can ever dissever my soul from the soul
   Of the beautiful Annabel Lee;


#2 2020-05-29 12:53:31

Trilby
Inspector Parrot
Registered: 2011-11-29
Posts: 29,442

Re: [SOLVED] Create a text tree from a text file

To do exactly what you are asking, this would get the job done:

gawk -F"'" '{
	patsplit($1, arr, "[0-9][0-9]*");
	for (i = 1; i < arr[1]; ++i)
		printf "| "
	if (arr[1] > 0) printf "|-";
	print $2;
}' infile

But I suspect there might be a bit of an X-Y question.  Is your input data actually coming from the wiki or what's the initial source?  There may be better ways to get what you want.

If we can assume that the dot after the "2" in your input was a typo and was actually a comma, this would be slightly better (and more portable):

awk -F"[[,]" '{
	for (i = 1; i < $2; ++i)
		printf "| ";
	if ($2 > 0) printf "|-";
	split($0, arr, "'"'"'");
	print arr[2];
}' infile
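
For comparison, here is a rough Python sketch of the same logic: pull out the level number and the first quoted name, then build the prefix from the level. The function name `tree_line` and the regular expression are my own choices for illustration, and it assumes every entry looks like `[<number>, '<name>', ...` with no apostrophes inside the name.

```python
import re

# Matches "[<level>, '<name>'" at the start of a line, allowing leading whitespace.
# Assumption: names never contain a single quote.
LINE_RE = re.compile(r"^\s*\[(\d+),\s*'([^']*)'")

def tree_line(line):
    """Return the tree-formatted name for one input line, or None if it is not an entry."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    level = int(m.group(1))
    name = m.group(2)
    if level == 0:
        return name
    # One "| " per level above the first, then the "|-" branch marker.
    return "| " * (level - 1) + "|-" + name

sample = """[0, 'Contents', 'Contents', null]
[1, 'Internet', 'Internet', 3242]
[2, 'Network connection', 'Network_connection', 24234]
[3, 'Network managers', 'Network_managers', 21312]
[3, 'VPN clients', 'VPN_clients', 12321]
-----snip-----"""

for line in sample.splitlines():
    out = tree_line(line)
    if out is not None:  # skip lines such as "-----snip-----"
        print(out)
```

On the sample from the first post this should print the same tree the awk versions produce, and lines that are not entries are skipped instead of printed as empty rows.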

Last edited by Trilby (2020-05-29 13:05:03)


"UNIX is simple and coherent..." - Dennis Ritchie, "GNU's Not UNIX" -  Richard Stallman


#3 2020-05-29 13:02:54

Awebb
Member
Registered: 2010-05-06
Posts: 6,272

Re: [SOLVED] Create a text tree from a text file

Google:
awk split columns by delimiter

Your delimiter is ,

Get rid of the [ and ] first. One way of getting rid of something is replacing it with nothing.

In awk, you print a string n times with a for loop (google: awk print string n times), and you already have the number in $1, if you followed my delimiter hint, so there is a chance you can print - $1 times.

That's awk.
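
The same steps (drop the brackets, split on the comma, repeat a dash once per level) can also be sketched in Python, if that is easier to follow than awk. This is only an illustration: `dashed_line` is a made-up name, and it assumes no item name contains a comma.

```python
def dashed_line(line):
    """Return the name prefixed with one dash per level, or None for non-entry lines."""
    # Get rid of the brackets by replacing them with nothing.
    cleaned = line.replace("[", "").replace("]", "")
    # Split into columns on the comma delimiter and trim surrounding spaces.
    fields = [f.strip() for f in cleaned.split(",")]
    # The first column must be the level number; otherwise this is not an entry.
    if not fields[0].isdigit() or len(fields) < 2:
        return None
    level = int(fields[0])
    # The second column is the quoted name; strip the surrounding quotes.
    name = fields[1].strip("'")
    # Print "-" level times in front of the name.
    return "-" * level + name

for line in [
    "[0, 'Contents', 'Contents', null]",
    "[1, 'Internet', 'Internet', 3242]",
    "[3, 'VPN clients', 'VPN_clients', 12321]",
]:
    print(dashed_line(line))
```

Splitting on the comma is the fragile part: a name with a comma in it would be cut short, which is one reason to split on the quote character instead.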

@edit: You gave that man a fish, Trilby :-P

Last edited by Awebb (2020-05-29 13:04:20)


#4 2020-05-29 13:08:54

Trilby
Inspector Parrot
Registered: 2011-11-29
Posts: 29,442

Re: [SOLVED] Create a text tree from a text file

Awebb wrote:

You gave that man a fish, Trilby :-P

Yes, but I'm pretty sure he's actually looking for chicken, but just doesn't know it yet.  So rather than spending a fair bit of time on a fishing lesson only for him to come back with his first fish troubled that that's not what he really wanted, I figure it's better to offer him the fish right off the bat (or slap him with it if need be) and then address the real question that comes after "no, not that kind of fish, the kind that clucks and has feathers.  How do I ..."


#5 2020-05-29 13:15:58

CarterCox
Member
From: Argentina
Registered: 2018-02-24
Posts: 116

Re: [SOLVED] Create a text tree from a text file

Trilby wrote:

To do exactly what you are asking, this would get the job done:

gawk -F"'" '{
	patsplit($1, arr, "[0-9][0-9]*");
	for (i = 1; i < arr[1]; ++i)
		printf "| "
	if (arr[1] > 0) printf "|-";
	print $2;
}' infile

But I suspect there might be a bit of an X-Y question.  Is your input data actually coming from the wiki or what's the initial source?  There may be better ways to get what you want.

If we can assume that the dot after the "2" in your input was a typo and was actually a comma, this would be slightly better (and more portable):

awk -F"[[,]" '{
	for (i = 1; i < $2; ++i)
		printf "| ";
	if ($2 > 0) printf "|-";
	split($0, arr, "'"'"'");
	print arr[2];
}' infile

It is not the wiki. It is a .js file from a much more complicated website that involves dynamic elements. The list was obtained by mirroring the site with wget and finding the file manually by sheer luck, which is why I don't know of a better way to obtain the list in some other format.

This is the site, just in case: http://www.thingsmadethinkable.com/item … wledge.php

The closest I got to a list was this (fields_of_knowledge.js):

[0, 'Fields of Knowledge', 'List_of_academic_disciplines_and_sub-disciplines', null],  // root
    [1, 'Humanities', 'Humanities', 168169],
    [2, 'History', 'History', 238581],
    [3, 'African history', 'African_history', 451689],
    [3, 'American history', 'American_history', 466828],
    [3, 'Ancient history', 'Ancient_history', 426924],
    [3, 'History of Asia', 'History_of_Asia', 172584],
    [3, 'History of Europe', 'History_of_Europe', 418591],
    [3, 'Chinese history', 'History_of_China', 311422],
    [3, 'Cultural history', 'Cultural_history', 61388],
    [3, 'Diplomatic history', 'Diplomatic_history', 87965],
    [3, 'Economic history', 'Economic_history', 116080],
    [3, 'Ethnohistory', 'Ethnohistory', 52718],
    [3, 'Greek history', 'Ancient_Greece', 252811],
    [3, 'History of education', 'History_of_education', 254377],
    [3, 'History of science and technology', 'History_of_science_and_technology', 169771],
    [3, 'Iranian history', 'History_of_Iran', 395698],
---snip---

The code you gave worked extremely well. Thank you very much.

PS: The . was actually a typo. It's fixed now.



#6 2020-05-29 13:19:26

Trilby
Inspector Parrot
Registered: 2011-11-29
Posts: 29,442
Website

Re: [SOLVED] Create a text tree from a text file

Ha ... there it is.  I suspected it was actually json data.  In that case there are much better and more robust ways.  What you posted in your initial post was not quite json data - so we had to just treat it as text (I even considered using text processing tools to turn it into proper json first).

But hey, if you're ok with your fish ...

When you realize that's not what you really wanted, you should read up on `jq` or python's json module.
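
As a starting point for the Python route, here is a minimal sketch that massages each line of the posted .js snippet into valid JSON and then lets the json module do the parsing. The helper names `parse_entry` and `tree_prefix` are made up for this example, and the cleanup rests on assumptions about the file: one entry per line, `//` only ever starts a comment, and single quotes only ever delimit strings.

```python
import json

def parse_entry(line):
    """Turn one line of the .js list into a Python list, or None if it is not an entry."""
    # Drop a trailing "// comment", surrounding whitespace, and the trailing comma.
    text = line.split("//", 1)[0].strip().rstrip(",")
    if not text.startswith("["):
        return None
    # Assumption: single quotes only delimit strings, so swapping them
    # for double quotes yields valid JSON (null is already valid JSON).
    return json.loads(text.replace("'", '"'))

def tree_prefix(level):
    """Build the "| | |-" style prefix for a given nesting level."""
    return "" if level == 0 else "| " * (level - 1) + "|-"

sample = """[0, 'Fields of Knowledge', 'List_of_academic_disciplines_and_sub-disciplines', null],  // root
    [1, 'Humanities', 'Humanities', 168169],
    [2, 'History', 'History', 238581],
    [3, 'African history', 'African_history', 451689],
---snip---"""

for line in sample.splitlines():
    entry = parse_entry(line)
    if entry is not None:
        level, name = entry[0], entry[1]
        print(tree_prefix(level) + name)
```

Once the entries are real lists, the other two columns (the page name and the number) are available as well, which the text-only approach throws away.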


#7 2020-05-29 13:27:03

CarterCox
Member
From: Argentina
Registered: 2018-02-24
Posts: 116

Re: [SOLVED] Create a text tree from a text file

Trilby wrote:

Ha ... there it is.  I suspected it was actually json data.  In that case there are much better and more robust ways.  What you posted in your initial post was not quite json data - so we had to just treat it as text (I even considered using text processing tools to turn it into proper json first).

But hey, if you're ok with your fish ...

When you realize that's not what you really wanted, you should read up on `jq` or python's json module.

I imagined the more knowledgeable people here would realize it was a js file. It truly was an X-Y question, but it really got the job done. So... yeah, thanks again.

