I am working on a program that will have several sets of data (in files) and will have to evaluate operations given to it by a script file. The operations are union (+) and Intersection (*). Union is all the words that appear in either set, Intersection is only the words that appear in both sets.
I have gotten the program mostly done, the only problem is that, for some reason, when the sets get fairly large (10000 words or more) it seems to stop working properly, but it works fine with slightly smaller sets. I would post the sets and the output, but they are fairly large and I don't want to clog up the post.
The specific problem is that the expected output of the case that I am working on should have 2584 words, but my output has 3317. The extra words are not duplicates, they are just extras that don't belong there. I thought that it might be either my union or intersection functions, but I have tried 3 different intersection functions and 3 different union functions that all give me the same wrong answer.
Here is the script file I am reading:
1 2 3 4 5 6 7 8
|
read(A,'set6.txt')
read(B,'set7.txt')
read(C,'set8.txt')
read(D,'set9.txt')
R1=(A+B)*C
R2=B*(C+D)
R3=R1+R2
write(R3,'out6.txt')
| |
My program successfully reads this, stores all the names and files, and sends the expressions to my evaluate function, then stores the answers into an array of linked lists that I have. Each text file is also stored into a linked list that it inside an array of linked lists that contains all the data sets.
Here is my favorite union and intersection functions:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56
|
template <class M>
LinkedList<string> LinkedList<M>::Intersection2(LinkedList<string> L2)
{
LinkedList<M> Intersection;
Node<M> *temp;
Node<M> *temp2;
temp = head;
temp2 = L2.head;
while(temp != NULL)
{
if(temp->data < temp2->data)
temp = temp->next;
else if(temp->data > temp2->data)
temp2 = temp2->next;
else if (temp->data == temp2->data)
{
Intersection.push_back(temp->data);
temp = temp->next;
temp2 = temp2->next;
}
}
return Intersection;
}
template <class M>
LinkedList<M> LinkedList<M>::Union(LinkedList<M> L2)
{
LinkedList<string> merged;
Node<M> *temp;
Node<M> *temp2;
temp = head;
temp2 = L2.head;
while(temp != NULL)
{
merged.push_back(temp->data);
temp = temp->next;
}
while(temp2 != NULL)
{
merged.push_back(temp2->data);
temp2 = temp2->next;
}
//merged.noDuplicates();
merged.mergeSort();
//merged.removeDuplicates();
merged.noDups();
return merged;
}
| |
That intersection function seems to be my most efficient, but the problem is that I have to make sure I pass it the largest and smallest set in the right order, and i have no way to check that in my program.
Here is my noDups() function:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
|
template <class M>
void LinkedList<M>::noDups()
{
Node<M> *Curr;
Node<M> *trail;
Curr=head;
while(Curr->next != NULL)
{
trail = Curr;
Curr = Curr->next;
if(trail->data == Curr->data)
{
trail->next = Curr->next;
delete Curr;
Curr = trail;
}
}
}
| |
And here is my evaluate function:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70
|
LinkedList<string> Evaluate(string exp, LinkedList<string> set[], string fnames[], LinkedList<string> expr[], string exprNames[])
{
cout << "Expression: " << exp << endl;
cout << "J: " << j << endl << "M: " << m << endl << "L: " << l << endl << "O: " << o << endl;
Stack<LinkedList<string> > temp(100);
int size = exp.length();
ofstream imp("testing.txt");
//cout << "l: " << l << endl << "J: " << j << endl << "o: " << o << endl;
for (int p = 0; p < j; p++)
{
if(exp.find(fnames[p]) != string::npos)
{
cout << "Found: " << fnames[p] << endl;
temp.push(set[p]);
}
}
for(int y = 0; y < l; y++)
{
if(exp.find(exprNames[y]) != string::npos)
{
cout << "Found: " << exprNames[y] << endl;
temp.push(expr[y]);
}
}
for (int i = 0; i < size; ++i)
{
if (exp[i] == '+' || exp[i] == '*')
{
LinkedList<string> val1 = temp.top();
temp.pop();
val1.outFileF(imp);
// cout << endl << "Val1: " << endl;
// val1.displayF();
LinkedList<string> val2 = temp.top();
temp.pop();
// cout << endl << "Val2: " << endl;
// val2.displayF();
switch (exp[i])
{
case '+':
cout << "==========Before union!" << endl;
temp.push(val1.Union(val2));
//temp.push(val2.getUnion(val1));
cout << "==========after union!" << endl;
break;
case '*':
cout << "---------before intersection" << endl;
//temp.push(val1.Intersection(val2));
temp.push(val2.Intersection2(val1));
//temp.push(val2.getIntersection(val1));
cout << "---------after intersection" << endl;
break;
}
}
}
return temp.top();
}
| |
I'm thinking the problem must be somewhere in the evaluate function, but the cout statements tell me it's using the correct sets and everything. The string arrays that are passed to the evaluate function are parallel arrays that contain the names of the sets. I figured that would be an easy way to search for a specific set.
I'm sorry this post is so long, but I wanted to make sure there was enough info for someone to help me. Thanks in advance!