Блог пользователя demoralizer

Автор demoralizer, история, 5 лет назад, По-английски

I was implementing a Trie for Google Kickstart earlier today and the code is malfunctioning in a very strange way. I have no idea why this is happening. I narrowed down the problem in a few hours.

#include <bits/stdc++.h>
using namespace std;
#define rep(i,a,b)        for(int i=a;i<b;i++)

struct node{
	int ends=0;
	int child[26]={0};
};

vector<node> trie;
int makenode(){
	trie.push_back(node());
	int x=(int)trie.size() - 1;
	cout<<x<<" ";
	return x;
}

void insert(string &s,int root=0,int pos=0){
	if(pos==(int)(s.size())){
		trie[root].ends++;
		return;
	}
	int next=s[pos]-'A';
	if(!trie[root].child[next]){
		// trie.push_back(node()); trie[root].child[next]=(int)trie.size()-1;    //THIS LINE WORKS FINE
		trie[root].child[next]=makenode();
		cout<<trie[root].child[next]<<"\n";
	}
	insert(s,trie[root].child[next],pos+1);
}

int main(){
	trie.clear();
	trie.push_back(node());
	string s;
	cin>>s;
	insert(s);
	return 0;
}

The function makenode() returns x. When I print x, the values are fine but after returning the values are getting changed.

One would expect the above code to print the same integers per line but the values aren't same.

These are the values of x before and after returning from the function makenode().

Does anybody have any idea why this is happening?

(I have noticed that only the values 1,2,4,8,16,32,etc are wrong)

  • Проголосовать: нравится
  • +29
  • Проголосовать: не нравится

»
5 лет назад, # |
  Проголосовать: нравится +65 Проголосовать: не нравится

It's undefined behavior: you take a reference to trie[root] before executing makenode(). The function pushes back to a vector, which can then reallocate. Then the reference you've just taken becomes invalid, so you're writing to some freed memory.

  • »
    »
    5 лет назад, # ^ |
      Проголосовать: нравится +3 Проголосовать: не нравится

    Oh right! That must be it! Should I reserve extra memory for the vector to avoid this undefined behaviour?

    • »
      »
      »
      5 лет назад, # ^ |
        Проголосовать: нравится +21 Проголосовать: не нравится

      That is precisely the problem. You can see it in detail by compiling your program with g++ -fsanitize=address -fsanitize=undefined -D_GLIBCXX_DEBUG and then noting the following lines on running the executable:

      ==6317==ERROR: AddressSanitizer: heap-use-after-free on address 0x60b000000048 at pc 0x5555555624c1 bp 0x7fffffffd770 sp 0x7fffffffd760
      WRITE of size 4 at 0x60b000000048 thread T0
          #0 0x5555555624c0 in insert(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, int, int) /home/gt/test.cpp:25
          #1 0x5555555628dc in main /home/gt/test.cpp:36
      
      0x60b000000048 is located 8 bytes inside of 108-byte region [0x60b000000040,0x60b0000000ac)
      freed by thread T0 here:
          ...
          #8 0x555555562105 in makenode() /home/gt/test.cpp:12
      
      previously allocated by thread T0 here:
          ...
          #8 0x55555556285c in main /home/gt/test.cpp:33
      

      It explicitly states that there was an attempt to reference a pointer after it was freed.

      With this compilation flags you can mostly avoid such undefined behavior in C++ :)

    • »
      »
      »
      5 лет назад, # ^ |
        Проголосовать: нравится +6 Проголосовать: не нравится

      IMO you should rather avoid this pattern at all costs, it's going to hurt you when you least expect it. The line you commented out is fine, you could also do something like int node = makenode; trie[root].child[next] = node;.

      (But yes, reserving the memory probably would help you in this case.)

»
5 лет назад, # |
  Проголосовать: нравится +40 Проголосовать: не нравится

As far as I remember in c++17, order of evaluation for assignment operator is forced to be rtl, So just upgrading your compiler may help