Currently there are three functions for manipulating strings: paste, spaste, and split. (Clearly more are needed.)
paste takes a list of values, converts them all to scalar strings, and returns their concatenation as a scalar string value. For example,
a := [2,3,5]
paste( "the first three primes are", a )
yields
the first three primes are [2 3 5]
The []'s seen here in the string representation of the vector a only occur for a numeric value with more than one element.
Similarly,
paste( "hello", "there" )
is equivalent to the string constant
'hello there'
By default, the string values are concatenated together using a single space. The optional sep= argument can be used to specify a string to use instead. For example,
paste("hello", "there", "how", "are", "you?", sep="XYZ")
yields
helloXYZthereXYZhowXYZareXYZyou?
Note that the arguments to paste are first converted to scalar strings, and then concatenated together. So
paste( "hello there", 1:3, sep="" )
yields
hello there[1 2 3]
and not
hellothere[123]
If a single string vector is passed to paste, its elements are concatenated using the separator. For example,
paste("a b c d e", sep='' )
results in the string "abcde". With multiple string arguments, the separator is placed between the arguments, but not between the individual elements of each argument.
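The conversion and concatenation rules above can be modeled outside Glish. The following is a minimal Python sketch, not Glish code and not the actual implementation: a Glish double-quoted string is represented as a Python list of words, a single-quoted string as a Python str, and the helper name to_scalar is hypothetical. How paste treats cases not shown above (for example, a single numeric vector) is an assumption here.

```python
def to_scalar(value):
    """Model of converting one paste argument to a scalar string."""
    if isinstance(value, str):
        return value                      # already a scalar string
    if isinstance(value, (int, float)):
        return str(value)                 # numeric scalar: no brackets
    items = list(value)
    if items and all(isinstance(item, str) for item in items):
        return " ".join(items)            # string vector: words joined by one blank
    text = " ".join(str(item) for item in items)
    # Brackets appear only for a numeric value with more than one element.
    return f"[{text}]" if len(items) > 1 else text


def paste(*args, sep=" "):
    """Model of paste: convert each argument to a scalar string, then join with sep."""
    if len(args) == 1 and not isinstance(args[0], (str, int, float)):
        items = list(args[0])
        if all(isinstance(item, str) for item in items):
            # A single string vector: its elements are joined with sep.
            return sep.join(items)
    return sep.join(to_scalar(arg) for arg in args)


# The examples from the text, with double-quoted constants written as word lists.
print(paste(["the", "first", "three", "primes", "are"], [2, 3, 5]))
print(paste(["hello", "there"], range(1, 4), sep=""))
print(paste(["a", "b", "c", "d", "e"], sep=""))
```

In this model the three calls print "the first three primes are [2 3 5]", "hello there[1 2 3]", and "abcde", matching the results described above.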
spaste is simply a version of paste with the separator set to an empty string. It is defined using:
func spaste(...) paste(...,sep='')
This use of paste is common enough that it merits its own function.
split is basically the inverse of paste. It takes a single argument, converts it to a scalar string, and splits it into words at each block of whitespace, just as string constants are constructed when enclosed in double-quotes (see § 3.3.1). Thus
split('hello there how are you?')
is equivalent to
"hello there how are you?"
that is, it yields a five-element string vector.
You can also call split with a second argument, giving a string of characters at which it should break the string. For example,
split("hello there how are you", "eo")
yields the equivalent of
['h', 'll', ' th', 'r', ' h', 'w ar', ' y', 'u']
Here the first element is 'h', the second is 'll', the third ' th', and so forth. The presence of the single leading space in ' th' may be surprising. What happened is that first split converted
"hello there how are you"
to a scalar value, equivalent to
'hello there how are you'
since all information about the number of blanks between words was lost when the double-quoted constant was constructed. Next, split broke the scalar into words at every occurrence of an 'e' or an 'o', but not at each blank as it would without the second argument. If a blank is included in the second argument, these extra blanks disappear:
split("hello there how are you", 'eo ')
yields the equivalent of
"h ll th r h w ar y u"
Note that you have to enclose the second argument in single-quotes; otherwise the blank would be removed.
If the second argument is a null string, the first string is broken up into individual characters. Here is an example:
split("how are you", '')
yields the string:
"h o w a r e y o u"
Note that the fourth and eighth elements are space characters.
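The three modes of split described above (whitespace, a set of break characters, and the empty string) can be summarized in a small model. This is a Python sketch rather than Glish, under the same representation as before: a double-quoted Glish string is a Python list of words, a single-quoted one a Python str. Dropping empty pieces between adjacent break characters is an assumption inferred from the 'eo ' example, not something the text states.

```python
def split(value, chars=None):
    """Model of split: break a scalar string into a list of pieces."""
    # A string vector is first converted to a scalar string,
    # with its words joined by single blanks.
    text = value if isinstance(value, str) else " ".join(value)

    if chars is None:
        return text.split()       # break at each block of whitespace
    if chars == "":
        return list(text)         # break into individual characters

    pieces, current = [], ""
    for ch in text:
        if ch in chars:
            if current:           # assumption: empty pieces are dropped
                pieces.append(current)
            current = ""
        else:
            current += ch
    if current:
        pieces.append(current)
    return pieces


words = ["hello", "there", "how", "are", "you"]
print(split(words, "eo"))         # blanks are kept inside the pieces
print(split(words, "eo "))        # blanks are break characters too
print(split(["how", "are", "you"], ""))
```

In the last call the fourth and eighth elements of the result are single space characters, as in the example above.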