Hmm, huge files... If all you are doing is
removing data, then you've got it made.
A B C D E F G H I J K & & & L M N O P Q R S T U V W X Y Z
To remove: & & &
Needed: A buffer at least as large as the stuff to remove (preferably as large as possible). For this example, I'll use a 6 character buffer.
Step 1: find the stuff to remove:
A B C D E F G H I J K & & & L M N O P Q R S T U V W X Y Z
                      ^
[. . . . . .]
Step 2. read past it and note its length
A B C D E F G H I J K & & & L M N O P Q R S T U V W X Y Z
                            ^ length = 3
[. . . . . .]
Step 3. read till your buffer is full or EOF. Remember how many you read.
A B C D E F G H I J K & & & L M N O P Q R S T U V W X Y Z
                                        ^
[L M N O P Q]
Step 4. seek backward (removed-length + buffer-length)
A B C D E F G H I J K & & & L M N O P Q R S T U V W X Y Z
                      ^ seek( -9 )
[L M N O P Q]
Step 5. write your buffer
A B C D E F G H I J K L M N O P Q O P Q R S T U V W X Y Z
                                  ^
[L M N O P Q]
Step 6. seek forward the (removed-length)
A B C D E F G H I J K L M N O P Q O P Q R S T U V W X Y Z
                                        ^ seek( 3 )
[L M N O P Q]
Step 7. go back to step 3 and repeat until EOF
A B C D E F G H I J K L M N O P Q O P Q R S T U V W X Y Z
                                                    ^
[R S T U V W]
A B C D E F G H I J K L M N O P Q O P Q R S T U V W X Y Z
                                  ^
[R S T U V W]
A B C D E F G H I J K L M N O P Q R S T U V W U V W X Y Z
                                              ^
[R S T U V W]
A B C D E F G H I J K L M N O P Q R S T U V W U V W X Y Z
                                                    ^
[R S T U V W]
A B C D E F G H I J K L M N O P Q R S T U V W U V W X Y Z
                                                          ^ EOF
[X Y Z . . .]
Step 8. remember that you may not have filled the buffer entirely when you hit EOF. Seek backward (removed-length + bytes-read)
A B C D E F G H I J K L M N O P Q R S T U V W U V W X Y Z
                                              ^ seek( -6 )
[X Y Z . . .]
Step 9. write what remains
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z X Y Z
                                                    ^
[X Y Z . . .]
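Steps 2 through 9 can be sketched in plain C roughly like this. It is only a sketch under a few assumptions: remove_range and its parameters are names I made up for illustration, the stream must be opened for update in binary mode ("r+b"), and it tracks absolute read and write offsets instead of doing the relative seeks shown above (same effect, less bookkeeping). It also uses plain fseek with long offsets for brevity; for truly huge files you would swap in a 64-bit seek such as fseeko (POSIX) or _fseeki64 (MSVC).

```c
#include <stdio.h>

/* Hypothetical helper: delete `count` bytes starting at offset `start` by
   sliding the rest of the file down over them, one buffer-load at a time.
   Returns the new logical length, or -1 on error. The stale bytes left at
   the end of the file still have to be truncated separately. */
long remove_range( FILE *fp, long start, long count, char *buffer, size_t bufsize )
{
  size_t got;
  long   src = start + count;   /* where to read next  (just past the removed data) */
  long   dst = start;           /* where to write next (where the removed data began) */

  for (;;)
  {
    /* Step 3: read until the buffer is full or EOF, remembering how many */
    if (fseek( fp, src, SEEK_SET )) return -1;
    got = fread( buffer, 1, bufsize, fp );
    if (got == 0) break;

    /* Steps 4, 5, 8, 9: go back and write exactly what was read */
    if (fseek( fp, dst, SEEK_SET )) return -1;
    if (fwrite( buffer, 1, got, fp ) != got) return -1;

    /* Step 6: both positions advance by the amount just copied */
    src += (long)got;
    dst += (long)got;
  }
  if (ferror( fp )) return -1;

  /* Leave the file position at the new logical end of the data */
  if (fseek( fp, dst, SEEK_SET )) return -1;
  return dst;
}
```

With the example file and a 6 character buffer, remove_range( fp, 11, 3, buf, 6 ) should return 26 and leave the file looking like the last diagram above, trailing X Y Z included.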
Finally, you may notice that there is still extra data at the end of the file. Standard C and C++
stream I/O has no way to truncate a file (C++17 later added std::filesystem::resize_file), so you
will have to use an OS-specific method. In any case, first close the file. We will be using
OS-specific file functions now.
Win32
Compile with or without Unicode support. Works for C or C++.
#include <stdint.h>   /* uint64_t (ISO C99 and ISO C++11) */
#include <windows.h>

/* Remove the last (bytes) bytes from the named file. Returns TRUE on success. */
BOOL shorten_file( LPCTSTR filename, long bytes )
{
  HANDLE   filehand;
  uint64_t filesize;
  LONG     high;
  DWORD    low;
  BOOL     result;
  WIN32_FILE_ATTRIBUTE_DATA filedata;

  /* Learn the file's length */
  if (!GetFileAttributesEx( filename, GetFileExInfoStandard, &filedata )) return FALSE;
  /* Widen before shifting: shifting a 32-bit DWORD by 32 is undefined */
  filesize = ((uint64_t)filedata.nFileSizeHigh << 32) | filedata.nFileSizeLow;

  /* Calculate the new length, refusing to go below zero */
  if (bytes < 0 || (uint64_t)bytes > filesize) return FALSE;
  filesize -= (uint64_t)bytes;
  high = (LONG)(filesize >> 32);
  low  = (DWORD)(filesize & 0xFFFFFFFFUL);

  /* Modify the file's length */
  filehand = CreateFile(
    filename,
    GENERIC_READ|GENERIC_WRITE,
    0,
    NULL,
    OPEN_EXISTING,
    FILE_FLAG_RANDOM_ACCESS,
    NULL
    );
  if (filehand == INVALID_HANDLE_VALUE) return FALSE;

  /* INVALID_SET_FILE_POINTER is also a legal low DWORD, so check GetLastError() too */
  SetLastError( NO_ERROR );
  low = SetFilePointer( filehand, (LONG)low, &high, FILE_BEGIN );
  result = !(low == INVALID_SET_FILE_POINTER && GetLastError() != NO_ERROR);

  if (result) result = SetEndOfFile( filehand );
  CloseHandle( filehand );
  return result;
}
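The only fiddly arithmetic in the Win32 version is gluing the two 32-bit halves of the size into one 64-bit number and splitting it apart again. Here is that step on its own as portable C, in case it helps to see it in isolation. The size_halves struct and both function names are hypothetical stand-ins for the nFileSizeHigh/nFileSizeLow pair, not anything from the Windows headers.

```c
#include <stdint.h>

/* Hypothetical stand-in for the nFileSizeHigh / nFileSizeLow pair */
typedef struct { uint32_t high; uint32_t low; } size_halves;

/* Combine two 32-bit halves into one 64-bit size.
   The cast has to happen before the shift, or the high bits are lost. */
uint64_t join_halves( size_halves h )
{
  return ((uint64_t)h.high << 32) | h.low;
}

/* Split a 64-bit size back into its two 32-bit halves */
size_halves split_size( uint64_t size )
{
  size_halves h;
  h.high = (uint32_t)(size >> 32);
  h.low  = (uint32_t)(size & 0xFFFFFFFFu);
  return h;
}
```

For example, a 5 GiB file (5368709120 bytes) splits into high = 1 and low = 0x40000000.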
Linux
You need large file support (LFS) to work with files over 2 GB, both in the kernel and in glibc, which I presume you already know.
Compile with -D_FILE_OFFSET_BITS=64 if you know for sure that will work, or to be more general, make sure to use getconf (for example, getconf LFS_CFLAGS). See
http://www.suse.de/~aj/linux_lfs.html
Modern Linuxes take UTF-8, so the first argument will work for both ASCII and Unicode. Works with C or C++. Requires XSI-compliance.
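#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Remove the last (bytes) bytes from the named file. Returns nonzero on success. */
int shorten_file( const char *filename, long bytes )
{
  off_t filesize;
  struct stat filedata;

  /* Learn the file's length */
  if (stat( filename, &filedata )) return 0;

  /* Calculate the new length, refusing to go below zero */
  if (bytes < 0 || bytes > filedata.st_size) return 0;
  filesize = filedata.st_size - bytes;

  /* Modify the file's length */
  return truncate( filename, filesize ) == 0;
}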
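If you would rather not close and reopen the file, POSIX also lets you truncate through the descriptor underneath an open stdio stream, using fileno and ftruncate. This is only a sketch of that alternative: truncate_stream is a name I made up, and it assumes the stream is open for writing on a regular file.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: cut an open stdio stream down to `length` bytes.
   Returns nonzero on success. */
int truncate_stream( FILE *fp, off_t length )
{
  /* Push any buffered writes out to the descriptor before resizing */
  if (fflush( fp )) return 0;
  return ftruncate( fileno( fp ), length ) == 0;
}
```

Pass it the new logical length of the file once the data has been shifted down.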
Whew. Good luck!