There's a competitive programming problem that reads as follows:
"In the beginning there are 'n' empty rows. In each step either a number is added to the end of each row, or several numbers are deleted from the front of the row and you should print the sum of the deleted numbers (note that the row might become empty).
Input is in the following form:
First line contains two numbers 'n' and 'q' which represent the number of rows and events respectively.
In the next q lines one of the following two options will appear:
A) '1 x' which means that x should be added to the end of all rows.
B) '2 i j' which means that j elements should be deleted from the front of the row i (0<= j <= row's size).
Expected Output:
For each input of type B (i.e. '2 i j') the sum of deleted numbers should be printed."
Sample Input:
2 5
1 5
1 17
2 1 1
1 1
2 2 3
Sample Output:
5
23
Constraints:
Time: 1 Second
Memory: 256 MB
1 <= n, q <= 300 000
1 <= i <= n
1 <= x <= 10^9
#include <bits/stdc++.h>
usingnamespace std;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int q; cin >> q >> q; //Ignored the 'n' input since I didn't need it
queue<int> reference_line; //Since all the rows are the same I created as the reference row which represents all the rows (in case they are unchanged though)
map<int, queue<int> > line; //I use this map to save the rows that have some of their elements deleted. The key represents row index
while (q--) {
int v; cin >> v;
if (v == 1) {
cin >> v;
reference_line.push(v);
}
else {
cin >> v;
if (line.find(v) == line.end()) line[v] = reference_line;
int j; cin >> j;
int sum = 0;
while (j-- && !line[v].empty()) {
sum += line[v].front();
line[v].pop();
}
cout << sum << '\n';
}
}
return 0;
}
In addition to what Repeater said, you could also store the running sums instead of the numbers themselves. E.g., suppose the input numbers were:
4, 2, 3, 1, 5, 7, 3
You store the running sum like this:
4, 6, 9, 10, 15, 22, 25
Then if you need the sum from index 0 to 3 (incl.) you just print the value at index 3. If you need index 2 to index to index 6 you just print the value at index 6 minus the value at index 1.
Also note that the sums need to be long long since they could be as high as 1e9 * 3e5 = 3e14.
#include <bits/stdc++.h>
usingnamespace std;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n, q; cin >> n >> q;
vector<int> reference_line;
int line[n] = {};
while (q--) {
int v; cin >> v;
if (v == 1) {
cin >> v;
reference_line.push_back(v);
}
else {
cin >> v; v--;
int j; cin >> j;
int sum = 0;
for (int i = line[v]; i < line[v] + j; i++) {
if (i >= reference_line.size()) break;
sum += reference_line[i];
}
line[v] = j;
cout << sum << '\n';
}
}
return 0;
}
reserve reference_line's size (this is probably the biggest tweak)
move the if/break to just for (... extra stop condition aware of short circuit) and rearrange the logic as necessary (this may or may not help, unclear how compiler manages this)
verify cout sum is not a bottleneck and if it is, handle it differently (probably not an issue)
its hard to say. vector has a built in algorithm to attempt to manage memory sensibly, but every time you exceed the current size it does all the usual pointer junk:
creates a new pointer and allocates memory (more than current, + some amount to allow growth).
copies all the data to the new location
updates vector's pointer to new one
deletes old pointer.
the copy is bad enough, but getting and freeing memory is usually "slow" (relatively in terms of cpu clock cycles) as well as it goes through the OS to do it, which probably goes through interrupts and context switchs and all that sort of badness. Just the memory management is also fed thru an algorithm to decide which block to give you.
not every time in the loop, but doing this even once stinks when you are trying to beat a time constraint.
also push back costs more than vector[x] = value. not much more, but it all adds up. here, its probably a zero sum as you will have to track its used/actual size yourself. If you didn't need its size, it would save having the object track that... if you could rewrite it so you didnt need size (I don't see a way here, this is more useful when you have a fixed amount of stuff).
there are a couple of other nanosecond things you could try. How long does it take currently? can you profile it and spot the issue? I personally struggle to trust compilers because they used to be bad at their job. It shouldn't matter, but another thing I would have done here is declare ALL the variables at the program start, even the loop variable. Its probably optimized away, but just in case the compiler is feeling dumb today I would do that to avoid it creating and destroying (this is just a push/pop of the program stack I believe here) them each iteration. We are getting into the weeds here.... you might also eyeball the assembly listing for anything dumb it might be doing. This code is small enough that you can rewrite / modify it a couple of ways to see if you can stumble on a form that the compiler is better able to optimize.
Well, I read a bit more about vector memory allocation after you told me, and you're totally right, reserving it would be great and much more optimized.
However, in my last submission I didn't even use vectors (just normal arrays) and made use of running total as @tpb mentioned earlier, and it got "Accepted".
Regardless, it was a great lesson. Thanks for sharing your thoughts, really appreciate it.
Re-read tpb's post. No need to worry about memory allocation or context switches, if you implement it that way tpb says, each operation, whether type 1 or type 2, will require constant time.