[Solved] Use of pointers to hierarchical data structures in C++

rWarrior · 2011-03-18 04:57:05

I am having a hard time deciding whether to use pointers or not in my nested data structures.
Memory and CPU efficiencies are important as I am dealing with data files of sizes on the order of gigabytes.

I have decided to use several classes that are nested in a hierarchical manner to make the code more organized and reusable.

Here is a simplified version of my code:

struct Segment
{
  int start, end, value;
};

class Set
{
private:
  vector<Segment*>* segments;
public:
  Set() {
    segments = new vector<Segment*>();
  }
  ~Set() {
    for (size_t i = 0; i < segments->size(); ++i) {
      delete segments->at(i);
    }
    delete segments;
  }
  vector<Segment*>* getSegments() {
    return segments;
  }
  // accessor functions...
};

class Collection
{
private:
  vector<Set*>* sets;
public:
  Collection() {
    sets = new vector<Set*>();
  }
  ~Collection() {
    for (size_t i = 0; i < sets->size(); ++i) {
      delete sets->at(i);
    }
    delete sets;
  }
  void push_back(Set* set) {
    sets->push_back(set);
  }
  // other accessor functions...
};

int main()
{
  Collection collection;
  // open file, read file
  Set* set;
  while (!f.eof()) {
    Segment* segment = new Segment();
    // fill segment
    // instantiate set using "new" as needed, and fill it with segments
    collection.push_back(set);
  }
  
  // do other stuff
  
  return 0;
}

The main motivation to use pointers is so that the child data structures can be built up while reading in the data file, and a pointer is simply passed to the parent data structures, avoiding using accessor functions and copying pieces of data redundantly.

My two main concerns are:

1. Memory leak. Does the order in which the class destructors are called ensure that there is no memory leak? When the collection destructor is called, I needed to "delete sets->at(i)" to deallocate the Set objects allocated elsewhere, but does the Set destructor get called, or does a memory leak result?

2. KDevelop using gdb does not resolve the pointers, so I cannot see the content of the Collection object during debugging. I have to print them out, which can be cumbersome.

Is there a better way to do this?
(Besides using several global vectors to store different data types and hope that I remember to process related data concurrently...)

Last edited by rWarrior (2011-03-18 23:32:02)

n0stradamus · 2011-03-18 09:47:44

I can't really answer your first question, as I tend to avoid using pointers in favor of references. I can't tell which method is faster.
If you use the delete operator, the corresponding destructor is always called before memory deallocation.
In your case, every "delete sets->at(i)" statement inside the for-loop in the Collection class destructor calls the Set class destructor, which in turn runs:

for (size_t i = 0; i < segments->size(); ++i) {
  delete segments->at(i);
}
delete segments;

Segment is a plain struct with no user-defined constructor or destructor and only int members, so its destructor is trivial: deleting a Segment simply deallocates the memory. In short: there won't be a memory leak, as far as I can see. If you still have doubts about that, have a look at valgrind and similar tools.
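If you want to convince yourself without valgrind, a small counter makes the destructor chain visible. This is only a minimal sketch, not your actual code: the setsDestroyed counter, the add() member and the destroyedAfterScope() helper are made up for illustration, and the vectors are held by value to keep it short.

```cpp
#include <cstddef>
#include <vector>

// Counts Set destructor calls so the chain can be observed.
static int setsDestroyed = 0;

struct Segment {
  int start, end, value;
};

class Set {
public:
  ~Set() {
    // Runs whenever a Set is deleted, including from ~Collection below.
    for (std::size_t i = 0; i < segments.size(); ++i) {
      delete segments[i];
    }
    ++setsDestroyed;
  }
  void add(Segment* s) { segments.push_back(s); }
private:
  std::vector<Segment*> segments;
};

class Collection {
public:
  ~Collection() {
    for (std::size_t i = 0; i < sets.size(); ++i) {
      delete sets[i];  // calls Set::~Set(), then frees the memory
    }
  }
  void push_back(Set* s) { sets.push_back(s); }
private:
  std::vector<Set*> sets;
};

// Builds a Collection with n Sets, lets it go out of scope, and returns
// how many Set destructors ran.
int destroyedAfterScope(int n) {
  setsDestroyed = 0;
  {
    Collection c;
    for (int i = 0; i < n; ++i) {
      Set* s = new Set();
      s->add(new Segment());
      c.push_back(s);
    }
  }  // ~Collection runs here
  return setsDestroyed;
}
```

Calling destroyedAfterScope(3) should report that all three Set destructors ran once the Collection went out of scope.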

Concerning your second question about being able to dereference pointers while debugging: I use QtCreator, which can also be used for non-Qt-dependent applications. The QtCreator frontend for gdb allows you to dereference pointers (even automatically). As you're using KDE anyway, Qt will probably be installed.
I haven't used KDevelop yet, so I can't tell how big a change it is.

Hope I could help,

n0stradamus

davvil · 2011-03-18 12:42:20

As far as I can see, there should be no memory leaks if you use the classes as in the example. Well, it's not so clear what happens with the segment created in the loop of the main function, but this is probably due to the simplified example.

What in my experience can be an ugly source of headaches is splitting the allocation and deallocation across different classes. When you first write your code it's clear what's happening, but when it grows it is easy to lose track and end up with nasty bugs which are difficult to find.

I would recommend using smart pointers. They have some small overhead, but in most situations it is not noticeable. They are not difficult to implement, and there are also some libraries which include them (e.g. boost). Or I can give you a pointer (pun intended) to a nearly free project I work on, where we have such a class which should be easy to isolate from the rest of the code. As long as your program is for non-commercial purposes it should be OK with the license.
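To make the smart-pointer suggestion concrete, here is a minimal sketch. It assumes a C++11 compiler, so that std::unique_ptr is available; with an older compiler, a boost smart pointer could play a similar role. The add(), size() and at() members and the buildExample() helper are made up for illustration and are not part of the original code.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

struct Segment {
  int start, end, value;
};

class Set {
public:
  // Takes ownership of the segment; no manual delete is needed later.
  void add(std::unique_ptr<Segment> s) { segments.push_back(std::move(s)); }
  std::size_t size() const { return segments.size(); }
private:
  std::vector<std::unique_ptr<Segment> > segments;
};

class Collection {
public:
  // Takes ownership of the set.
  void push_back(std::unique_ptr<Set> s) { sets.push_back(std::move(s)); }
  std::size_t size() const { return sets.size(); }
  const Set& at(std::size_t i) const { return *sets.at(i); }
private:
  std::vector<std::unique_ptr<Set> > sets;
};

// Hypothetical helper: builds a collection holding one set with n segments.
Collection buildExample(int n) {
  Collection collection;
  std::unique_ptr<Set> set(new Set());
  for (int i = 0; i < n; ++i) {
    std::unique_ptr<Segment> seg(new Segment());
    seg->start = i;
    seg->end = i + 1;
    seg->value = 0;
    set->add(std::move(seg));  // ownership moves into the set
  }
  collection.push_back(std::move(set));  // ownership moves into the collection
  return collection;
}
```

Neither class needs a hand-written destructor here: when the Collection is destroyed, each unique_ptr deletes its Set, and each Set's unique_ptrs delete their Segments. Allocation and deallocation are no longer split between different places in the code.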

rWarrior · 2011-03-18 23:31:17

Thank you for your feedback.

I've decided to avoid making everything pointers.

I also agree that classes should create their own objects and destroy these objects themselves.
I originally decided that it was all right to allocate nested class objects outside of the class and deallocate them in the class destructor, but that leaves room for mistakes.

Now the classes allocate their own objects, and pass the pointer to the caller, so that the caller can populate the object without having to use accessor functions.

Smart pointers are definitely an interesting thought... though I might start relying on them too much and start coding like a Java programmer.

class Collection
{
private:
  vector<Set*> sets;
public:
  Collection() {}
  ~Collection() {
    for (size_t i = 0; i < sets.size(); ++i) {
      delete sets[i];
    }
  }
  // Returns the set with the given name, creating it if it does not exist yet.
  Set* create(const string& name) {
    Set* set = find(name);
    if (set == NULL) {
      set = new Set();  // assign to the outer variable instead of shadowing it
      sets.push_back(set);
    }
    return set;
  }
  Set* find(const string& name) {
    // return pointer to set with $name; return NULL if not found
  }
};
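For comparison, a design along these lines can also be written without any new or delete by storing the objects by value. The following is only a sketch under a few assumptions: the name_ member, the name() and segments() accessors and the size() member are hypothetical additions, and std::deque is used because push_back on a deque does not invalidate references to elements that are already stored.

```cpp
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

struct Segment {
  int start, end, value;
};

class Set {
public:
  explicit Set(const std::string& n) : name_(n) {}
  const std::string& name() const { return name_; }
  // Segments are stored by value: one contiguous buffer per Set.
  std::vector<Segment>& segments() { return segments_; }
private:
  std::string name_;
  std::vector<Segment> segments_;
};

class Collection {
public:
  // Returns the set called name, creating it first if necessary.
  Set& create(const std::string& name) {
    Set* existing = find(name);
    if (existing != NULL) return *existing;
    sets_.push_back(Set(name));
    return sets_.back();
  }
  // Linear search; returns NULL if no set has this name.
  Set* find(const std::string& name) {
    for (std::size_t i = 0; i < sets_.size(); ++i) {
      if (sets_[i].name() == name) return &sets_[i];
    }
    return NULL;
  }
  std::size_t size() const { return sets_.size(); }
private:
  // References handed out by create() stay valid across later push_back calls.
  std::deque<Set> sets_;
};
```

The caller can still fill a set in place through the returned reference, e.g. collection.create("a").segments().push_back(segment), and storing Segment objects by value avoids one heap allocation per segment, which may matter for gigabyte-sized inputs.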

Last edited by rWarrior (2011-03-18 23:36:24)

the_isz · 2011-03-19 01:34:19

You have much to learn, young Padawan.

Smart pointers are your friends, not your foes. In your specific example, you'd
have to precisely document how the pointer returned from Collection::create was
allocated and when it is going to be deleted (i.e. upon the destruction of the
Collection object). This is very dangerous, because it can easily create so-called
called dangling pointers. What if someone were to store the pointer returned
(see below)?

Smart pointers help you describe to the users of your class how and when the
objects they point to will be deleted:

std::auto_ptr:
The ownership of the pointer is transferred to the caller.

boost::shared_ptr:
The pointer is deleted when the last reference to it is deleted, i.e. as long as
the caller keeps a reference, it is safe to dereference.

boost::weak_ptr:
This is most like a raw pointer, but you can check whether the object has
already been deleted. This is not possible with a raw pointer:

Set* set = 0;

{
  Collection collection;
  set = collection.create("hello");
}

if (set)
{
  // Will always be reached, although set has been deleted already.
}
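For contrast, here is a minimal sketch of the same situation with a weak pointer. It assumes a C++11 compiler and uses std::shared_ptr and std::weak_ptr, the standard counterparts of the boost classes named above. The simplified Collection (whose create() always makes a new Set and ignores the name) and the two helper functions are made up for illustration.

```cpp
#include <memory>
#include <string>
#include <vector>

class Set {};

class Collection {
public:
  // Hands out a non-owning handle; the Collection remains the only owner.
  std::weak_ptr<Set> create(const std::string& /*name*/) {
    sets.push_back(std::make_shared<Set>());
    return sets.back();
  }
private:
  std::vector<std::shared_ptr<Set> > sets;
};

// The Collection is still alive, so the handle can be locked and used.
bool aliveWhileCollectionLives() {
  Collection collection;
  std::weak_ptr<Set> set = collection.create("hello");
  return static_cast<bool>(set.lock());
}

// The Collection has been destroyed, so the handle reports that the Set is gone.
bool aliveAfterCollectionDies() {
  std::weak_ptr<Set> set;
  {
    Collection collection;
    set = collection.create("hello");
  }  // collection and its Set are destroyed here
  return !set.expired();
}
```

Unlike the raw pointer, the weak pointer can tell the two cases apart: the first function returns true and the second returns false.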

Hope this helped you a bit
