CHOP
Version 3.1

Author: Walter J. Kennamer
Compuserve PPN: 74025,514

CHOP breaks big files into smaller ones. A number of options are supported
to determine exactly where the breaks take place. This version also allows
you to extract a portion of a file.

More information about CHOP follows; but first, a word from my lawyers:

------------------------------------------------------------------------------
Copyright (c) 1986, 1987 Walter J. Kennamer. All Rights Reserved.

You are free to use, copy and distribute CHOP providing that:

     NO FEE IS CHARGED FOR USE, COPYING OR DISTRIBUTION.
     IT IS NOT MODIFIED IN ANY WAY.
     THIS DOCUMENTATION FILE (UNMODIFIED) ACCOMPANIES ALL COPIES.

This program is provided AS IS without any warranty, expressed or implied,
including but not limited to fitness for a particular purpose.
------------------------------------------------------------------------------

Usage:  A>CHOP infile [-switches]

"Infile" is an unambiguous file name (i.e., no wildcards are allowed). The
original input file will be unchanged. The output files will have the same
stem as the input file, but the extension will be numbered consecutively
from 1. For example, if you break FOO.BAR into three smaller files, FOO.BAR
will be unchanged, and there will be three output files--FOO.1, FOO.2 and
FOO.3.

Each switch requires a separate switch character (e.g., "-a -b" rather than
"-ab"). Switch order does not matter, unless you enter mutually-exclusive
switches, in which case the last one takes precedence.

Type CHOP by itself to see help.

==============================
Command line options:
==============================

This table summarizes command line options. Options can be preceded by a
hyphen (-) or a slash (/) and can be upper or lower case.

These options determine how much of the file will be output.

     -Bx          Beginning byte to extract (default = 1).
     -Ex          Ending byte to extract (default = end of file).

These options determine how the file will be partitioned.
They are mutually exclusive.

     -Px          Chop file into x pieces (default = 2).
     -Sxxx        Chop file into xxx-sized pieces.

These options determine where the data comes from and where it goes.

     -Ifilename   Read input from "filename".
     -Odirectory  Send output to "directory".
     -T           Trample over existing files.

These options determine if the break will occur at a precise byte or at a
set of characters near the computed boundary. -R and -X are mutually
exclusive. -N, -A, -H, -L, -C and -W have no meaning if -X is selected.
-R is the default option. The term "return character(s)" means the
character(s) that determine where the file will be chopped.

     -R           Try to chop at a "return" character (default is CR/LF).
     -Nfoo        Define a sequence of "return" characters (e.g., "foo").
     -A           Chop after the "return" characters (default).
     -H           Chop before the "return" characters.
     -Lxxx        Limit search for "return" characters to xxx bytes.
     -C           Make "return" characters case sensitive.
     -W           Chop at each occurrence of the "return" string.
     -X           Chop at the exact computed byte.

     -Mxxx        Define the maximum number of chops (default = 256).
     -Gxxx        Start output file numbering with xxx.
     -Q           Quiet. Do not show program status on screen.
     -Z           Do not insert a Ctrl-Z EOF at end of each output file.
     -J           Pause for a keystroke between chops.

==============================
Terminology and other notes:
==============================

The term "computed break point" means the place in the file where the split
would normally occur, if CHOP was not doing something special about return
characters. For example, if you have a file of 100,000 bytes that is being
split into 5 parts, the computed break points are at these bytes:

     20,000     40,000     60,000     80,000

You can force the breaks to occur exactly at these points by using the -X
(exact) switch on the command line. Or, you can let CHOP try to find a
logical breaking point in the file (normally a carriage return / line feed).
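
The arithmetic behind the 100,000-byte example can be sketched in a few lines
of Python. This is only an illustration, not CHOP's own code: the function
name computed_break_points is made up for this sketch, and the use of integer
division is an assumption, since this documentation does not say how CHOP
treats a file size that does not divide evenly.

```python
def computed_break_points(file_size, pieces):
    """Return the byte numbers (counting from 1) where an even split breaks.

    Assumption: the piece size is file_size // pieces. How CHOP itself
    handles a remainder is not documented here.
    """
    piece_size = file_size // pieces
    # One break after each piece except the last one.
    return [piece_size * i for i in range(1, pieces)]


# The example from the text: 100,000 bytes split into 5 parts.
print(computed_break_points(100_000, 5))
```

With the -X switch these are the actual break points; without it they are
only the places where the search for a return string begins.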

The term "return string" or "return characters" means the sequence of one or
more characters that defines a newline, or some other interesting boundary
in the file. CHOP assumes that you would prefer to split the file at a
natural boundary, rather than just someplace in the middle. By default, CHOP
assumes the break should occur at a carriage return / line feed character
sequence (CR LF -- Hex 0D 0A). Thus, if CHOP plans to break the file at the
1000th byte, it will actually look a little ahead of byte 1000 to try to
find a newline (CR LF) and split it there, rather than at byte 1000.

You can redefine the return string to be something else. For example,
Compuserve messages begin with the "#:" character sequence. By defining this
sequence as the return string, you instruct CHOP to split the file only
between messages--no message would be split across CHOP output files. You
would define "#:" as the return string by using the switch "-n#:" on the
command line (see examples).

CHOP ordinarily splits a file after the return string. You can make the
split happen before the return string by using the -h switch. You would
probably want to use this switch in the preceding Compuserve example since
the "#:" characters mark the beginning of a message. You would probably want
them to be the first characters in a new file, rather than the last
characters in the preceding file.

You can also limit how far CHOP is willing to search for the return string.
The -L parameter determines how many bytes forward of the computed break
point CHOP looks for the return. The default is 1000 bytes. If it cannot
find the return string within the number of bytes specified with -L, CHOP
breaks the file at the computed point. CHOP never breaks a file before the
computed point. As a consequence, the last CHOP output file will typically
be a little smaller than the earlier ones: the differences between the
computed break points and the actual return boundaries mount up.
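
The search rules described above (look forward from the computed break point,
give up after the -L limit, split after or before the return string) can be
sketched in Python as follows. This is a rough illustration under stated
assumptions, not CHOP's actual implementation: the function find_break and
its parameters are hypothetical, the sketch requires the whole return string
to lie inside the search window, and it matches bytes exactly, whereas CHOP
ignores case unless -C is given.

```python
def find_break(data, computed, ret=b"\r\n", limit=1000, after=True):
    """Return how many leading bytes of data belong to the current piece.

    computed -- the computed break point, as a count of bytes
    ret      -- the return string (what -N would define)
    limit    -- how far forward to search (what -L would define)
    after    -- True to split after the return string (-A),
                False to split before it (-H)
    """
    window_end = min(len(data), computed + limit)
    # Search only forward of the computed point, never before it.
    pos = data.find(ret, computed, window_end)
    if pos == -1:
        # Not found within the limit: break at the computed point.
        return computed
    return pos + len(ret) if after else pos


sample = b"a" * 10 + b"\r\n" + b"b" * 10
print(find_break(sample, 5))               # split after the CR LF
print(find_break(sample, 5, after=False))  # split before the CR LF
print(find_break(sample, 5, limit=3))      # CR LF out of reach
```

Because the returned position is never smaller than the computed point, each
piece can only grow at the expense of the ones that follow, which is why the
last output file tends to be the smallest.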

The first byte of a file is byte 0 or byte 1. The second byte is always
byte 2. In other words, CHOP always counts from 1. If you specify byte 0, it
assumes you mean the beginning byte.

CHOP will ordinarily decline to overwrite any existing files, but will
display a message and halt instead. If you want to trample over existing
files (I had to use "trample"--T was the only letter left), use the -t
switch on the command line. If -t is specified, CHOP will write over any
files that get in its way.

Use the -g switch to change the beginning file number. For example, if you
want to chop FOO.BAR into several pieces, but you want the first one to be
numbered FOO.8, use the -g8 switch to set the starting number.

==============================
Examples:
==============================

CHOP foo.bar

     chops FOO.BAR into two pieces, FOO.1 and FOO.2. If you do not use
     either the -P or -S switches, CHOP assumes you want to split the file
     into two pieces.

CHOP foo.bat -p5

     chops the foo.bat program into five files of approximately equal size.
     The breaks take place after a CR/LF pair.

CHOP foo.bat -s2000 -x

     chops foo.bat into 2000-byte pieces. The first chop occurs exactly at
     byte 2000. Note that the output file will actually have 2001 bytes,
     counting the control-Z added to the file (though you can suppress it
     with the -z switch).

CHOP foo.bat -e2000

     copies the first 2000 bytes of foo.bat to FOO.1. By default, copying
     begins at the first byte of the file.

CHOP foo.bat -e2000 -p2

     puts the first 2000 bytes of foo.bat into 2 files of about 1000 bytes
     each.

CHOP foo.bat -b3000 -e3999 -p2 -r

     puts the 1000 bytes in foo.bat between byte 3000 and 3999 into 2 files.
     The chop will occur at the first CR/LF pair after byte 3500.

CHOP foo.bat -s20000 -nMSG: -oD:\CHOPOUT -g5 -c

     chops foo.bat into pieces of approximately 20,000 bytes. The first chop
     will occur immediately after the first occurrence of the string "MSG:"
     (case sensitive because of -c) after byte 20000.
     The output files will be D:\CHOPOUT\FOO.5, FOO.6, etc.

CHOP -id:\pdq\cserv.thd -p10 -n#: -h -l2000

     chops d:\pdq\cserv.thd into about 10 pieces, with the breaks occurring
     immediately before the character string "#:" (used by Compuserve forum
     software to designate a new message). CHOP will search up to 2000 bytes
     past the computed break point for the "#:" string. If it cannot find
     "#:" within 2000 bytes, it will give up and break at the computed
     point.

CHOP cserv.thd -h -n#: -w -t

     chops cserv.thd into many files--one for each occurrence of the "#:"
     string. The -w switch implies an unlimited search limit and overrides
     the -P, -S and -X switches. Overwrite any files (CSERV.1, CSERV.2,
     etc.) that are already there.

CHOP program.pas -nPROCEDURE -w -h

     chops the program.pas Pascal file into many files--one for each
     procedure. After executing this command, each of the procedures in
     "program.pas" will be in a separate file.

CHOP foo.bar -n$0C -p3

     This example shows how to put hex codes in the "return" string. The
     codes must be exactly two digits (i.e., precede single digit hex codes
     with a 0) and must be preceded by a dollar sign. This example causes
     foo.bar to be chopped into three pieces, with the chop taking place at
     a hex 0C character (ASCII decimal 12, or formfeed).

FOR %1 in (*.TXT) DO CHOP %1 -p4

     This example illustrates how to use the batch FOR command to CHOP a
     series of files as specified by the wildcard (*.TXT). In this case,
     each .TXT file will be chopped into four pieces of approximately equal
     size.

==============================
Rejoining Chopped Files
==============================

You can use the DOS copy command to rejoin chopped files. For example, this
command rejoins two text files--FOO.1 and FOO.2--to recreate the original
FOO.BAR file.

     COPY FOO.1/a+FOO.2 FOO.BAR

If FOO.BAR, FOO.1 and FOO.2 are binary files, use the /b switch with the
COPY command:

     COPY FOO.1/b+FOO.2 FOO.BAR

The /b switch causes DOS to treat control-Z characters as legitimate data
(instead of EOF marks) when the files are joined.

==============================
Support
==============================

If you have any problems with CHOP or any suggestions about how to improve
it, please contact me on Compuserve (PPN 74025,514) or write to me at
1801 E. 12th., Apt 1118, Cleveland, OH 44114.