Notes on S-Lang

I have recently become interested in S-Lang, so I started learning it as a complement to R.

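Variables and functions

Variables in S-Lang are declared with the variable keyword, for example:

variable x, y, z;

The data type of a variable does not have to be declared; it is determined on assignment, for example:

x = 3; y = sin (5.6); z = "I think, therefore I am";

Functions are defined with the define keyword, for example:

define compute_average (x, y)
{
    variable s = x + y;
    return s / 2.0;
}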

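Qualifiers

Qualifiers give a function optional, named arguments, for example:

define plot (x, y)
{
    variable linestyle = qualifier ("linestyle", "solid");
    variable color = qualifier ("color", "black");
    sys_set_color (color);
    sys_set_linestyle (linestyle);
    sys_plot (x, y);
}

Here linestyle and color are qualifiers. When the caller does not supply one, the second argument of qualifier is used as its default value.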
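
To show how a caller passes qualifiers, here is a minimal sketch. The function describe_line is made up for this illustration so that the example does not depend on the sys_* plotting functions above; qualifiers follow the ordinary arguments after a semicolon.

```slang
% describe_line is a hypothetical function used only for illustration.
define describe_line (label)
{
    variable linestyle = qualifier ("linestyle", "solid");
    variable color = qualifier ("color", "black");
    return strcat (label, ": ", linestyle, ", ", color);
}

% No qualifiers: both defaults are used.
message (describe_line ("curve"));                  % curve: solid, black
% Qualifiers are given after a semicolon as name=value pairs.
message (describe_line ("curve"; color="red"));     % curve: solid, red
message (describe_line ("curve"; linestyle="dashed", color="red"));
```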

Strings

S-Lang works with strings without requiring you to allocate memory for them. strcat can also be replaced by the + operator. For example:

define concat_3_strings (a, b, c)
{
    return strcat (a, b, c);
}

The equivalent in C requires the following:

#include <stdlib.h>
#include <string.h>

char *concat_3_strings (char *a, char *b, char *c)
{
    unsigned int len;
    char *result;
    /* Room for the three strings plus the terminating NUL. */
    len = strlen (a) + strlen (b) + strlen (c);
    if (NULL == (result = (char *) malloc (len + 1)))
        exit (1);
    strcpy (result, a);
    strcat (result, b);
    strcat (result, c);
    return result;
}
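
As a minimal sketch of the + form mentioned above (the name concat_3_strings_plus is made up for this illustration), the same concatenation can be written without strcat:

```slang
% Hypothetical variant of concat_3_strings that uses + instead of strcat.
define concat_3_strings_plus (a, b, c)
{
    return a + b + c;
}

message (concat_3_strings_plus ("S", "-", "Lang"));   % S-Lang
```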

Referencing and Dereferencing

Much like pointers in many other languages, the & operator creates a reference to another object or function, for example:

define compute_functional_sum (funct)
{
    variable i, s;
    s = 0;
    for (i = 0; i < 10; i++)
    {
        s += (@funct)(i);
    }
    return s;
}
variable sin_sum = compute_functional_sum (&sin);
variable cos_sum = compute_functional_sum (&cos);

Dereferencing uses the @ operator, which plays the role that * plays for pointers in C, for example:

define set_xyz (x, y, z)
{
    @x = 1;
    @y = 2;
    @z = 3;
}
variable X, Y, Z;
set_xyz (&X, &Y, &Z);

The C version looks like this:

void set_xyz (int *x, int *y, int *z)
{
    *x = 1;
    *y = 2;
    *z = 3;
}

Arrays

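S-Lang supports multi-dimensional arrays, for example:

variable A = Int_Type [10];     % a 1-D array of 10 integers
variable B = Int_Type [10, 3];  % a 2-D 10x3 array of integers
variable C = [1, 3, 5, 7, 9];   % an array built directly from its elements
variable D = [1:9:2];           % an array from 1 to 9 in steps of 2
variable E = [0:1:#1000];       % 1000 floating-point values from 0 to 1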
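
A small sketch to check what the two range forms produce. It only uses length, indexing, and vmessage; the message wording is arbitrary.

```slang
variable D = [1:9:2];        % 1, 3, 5, 7, 9
variable E = [0:1:#1000];    % 1000 evenly spaced doubles from 0 to 1

% A negative index counts from the end of the array.
vmessage ("D has %d elements, the last is %d", length (D), D[-1]);
vmessage ("E has %d elements, from %g to %g", length (E), E[0], E[-1]);
```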

Arrays are passed to functions by reference, so a function can modify the caller's array directly:

define init_array (a)
{
    variable i, imax;
    imax = length (a);
    for (i = 0; i < imax; i++)
    {
        a[i] = 7;
    }
}
variable A = Int_Type [10];
init_array (A);

Array element types

Double_Type : double
Complex_Type: complex
String_Type : string
Ref_Type : reference
Any_Type

Comparing S-Lang with C

The S-Lang statements

variable X, Y;
X = [0:2*PI:0.01];
Y = 20 * sin (X);

correspond to the following C code:

double *X, *Y;
unsigned int i, n;
n = (2 * PI) / 0.01 + 1;
X = (double *) malloc (n * sizeof (double));
Y = (double *) malloc (n * sizeof (double));
for (i = 0; i < n; i++)
{
    X[i] = i * 0.01;
    Y[i] = 20 * sin (X[i]);
}

Lists

A list is similar to an array, but the elements of a list may have different types.
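
A minimal sketch of a list holding elements of different types. The element values are arbitrary; typeof is used to show the type of each element.

```slang
% One list holding an integer, a string, a double and a function reference.
variable list = {1, "two", 3.0, &sin};
variable item;

foreach item (list)
    vmessage ("%S has type %S", item, typeof (item));
```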

Structures

As in C, S-Lang has structures:

variable person = struct { first_name, last_name, age };
variable bill = @person;
bill.first_name = "Bill";
bill.last_name = "Clinton";
bill.age = 51;

A named structure type is usually defined with typedef:

typedef struct { first_name, last_name, age } Person_Type;
variable bill = @Person_Type;
bill.first_name = "Bill";
bill.last_name = "Clinton";
bill.age = 51;

A structure type defined this way can be used directly as the element type of an array:

People = Person_Type [100]; 
People[0].first_name = "Bill"; 
People[1].first_name = "Hillary";

An array can also be declared with Struct_Type as its element type, with each element copied from an existing structure:

People = Struct_Type [100];
People[0] = @person;
People[0].first_name = "Bill";
People[1] = @person;
People[1].first_name = "Hillary";

Structure initialization functions

A structure can be created and initialized by a function such as:

define create_person (first, last, age)
{
    variable person = @Person_Type;
    person.first_name = first;
    person.last_name = last;
    person.age = age;
    return person;
}
variable Bill = create_person ("Bill", "Clinton", 51);

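String suffixes

R: backslashes are not treated as escape characters
Q: backslash escapes are processed
B: the string is interpreted as a binary string
$: variable names in the string are expanded

The following statements are all equivalent: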

file = "C:\\windows\\apps\\slrn.rc"; 
file = "C:\\windows\\apps\\slrn.rc"Q; 
file = "C:\windows\apps\slrn.rc"R; 
file = `C:\windows\apps\slrn.rc`; % slang-2.2 and above
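
The $ suffix is not covered by the statements above, so here is a minimal sketch of it. The function config_path and the path it builds are made up for this illustration.

```slang
% config_path is a hypothetical helper; dir and name are expanded inside
% the string because of the trailing $ suffix.
define config_path (dir, name)
{
    return "$dir/$name"$;
}

message (config_path ("/home/user", "slrn.rc"));   % /home/user/slrn.rc
```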

Null_Type

An argument that is omitted from a call is passed to the function as NULL:

define add_numbers (a, b)
{
    if (a == NULL) a = 0;
    if (b == NULL) b = 0;
    return a + b;
}
variable c = add_numbers (1, 2);
variable d = add_numbers (1, NULL);
variable e = add_numbers (1,);
variable f = add_numbers (,);

Ref_Type

sin_ref = &sin;
y = (@sin_ref) (1.0);

Other container types

Array_Type
Assoc_Type
List_Type
Struct_Type

DataType_Type

S-Lang provides many data types:

Type	Name
signed character	Char_Type
unsigned character	UChar_Type
short integer	Short_Type
unsigned short integer	UShort_Type
plain integer	Integer_Type
plain unsigned integer	UInteger_Type
long integer	Long_Type
unsigned long integer	ULong_Type
long long integer	LLong_Type
single precision real	Float_Type
double precision real	Double_Type
complex numbers	Complex_Type
strings, C strings	String_Type
binary strings	BString_Type
structures	Struct_Type
references	Ref_Type
NULL	Null_Type
arrays	Array_Type
associative arrays/hashes	Assoc_Type
lists	List_Type
DataType	DataType_Type

Type conversion

A value can be converted to another data type with typecast:

variable x = 10, y;
y = typecast (x, Double_Type);
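
A short sketch of why the conversion matters: dividing two integers gives an integer result, while converting one operand to Double_Type first gives a floating-point result. The numbers are arbitrary.

```slang
variable x = 10;
variable y = typecast (x, Double_Type);
vmessage ("%S -> %S", typeof (x), typeof (y));

% Integer division truncates; a Double_Type operand avoids that.
vmessage ("%d", 7 / 2);                           % 3
vmessage ("%g", typecast (7, Double_Type) / 2);   % 3.5
```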

Statements

In general, a statement is made up of expressions.

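Functions

The interpreter has two kinds of functions: intrinsic functions and slang functions. A slang function is declared as follows:

define function-name (parameter-list ) { statement-list }

Referencing and dereferencing a function: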

define derivative (f, x)
{
    variable h = 1e-6;
    return ((@f)(x+h) - (@f)(x)) / h;
}
define x_squared (x)
{
    return x^2;
}
dydx = derivative (&x_squared, 3);

A function can also be declared without a parameter list and take its arguments from the stack inside the function body:

define add_10 ()
{
    variable x;
    x = ();
    return x + 10;
}
variable s = add_10 (12); % ==> s = 22;

So a function can also be written in this form:

define function_name ()
{
    variable x, y, ..., z;
    z = ();
    .
    .
    y = ();
    x = ();
    .
    .
}
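
The order of the assignments matters: the last argument is on top of the stack, so it is taken first. A minimal sketch, with a function name (subtract) chosen only for this illustration:

```slang
% subtract is a hypothetical function that reads its two arguments
% from the stack instead of declaring a parameter list.
define subtract ()
{
    variable a, b;
    b = ();    % the last argument is on top of the stack
    a = ();    % then the first argument
    return a - b;
}

vmessage ("%d", subtract (10, 3));   % 7
```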

Computing an average:

define average_n (n)
{
    variable s, x;
    s = 0;
    loop (n)
    {
        x = (); % get next value from stack
        s += x;
    }
    return s / n;
}

If the number of arguments is not known in advance, _NARGS gives the number that was actually passed:

define average_n ()
{
    variable x, s = 0;
    if (_NARGS == 0)
        usage ("ave = average_n (x, ...);");
    loop (_NARGS)
    {
        x = ();
        s += x;
    }
    return s / _NARGS;
}

A function may also contain an EXIT_BLOCK, a block of statements that is executed when the function returns.

Namespaces

Names in a namespace can be declared private, public, or static. To load foo.sl into the namespace foo:

evalfile ("foo.sl", "foo");

% foo.sl
variable X = 1;
variable Y;
private variable Z;
public define set_Y (y) { Y = y; }
define set_z (z) { Z = z; }

Arrays

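An array can also be created by dereferencing Array_Type:

variable a = @Array_Type (data-type , integer-array );

To change the shape of an array, for example to turn a 1-D array into a 2-D array, use the reshape function:

reshape (array-name, integer-array)

For example:

variable a = Double_Type [100]; reshape (a, [10, 10]);

_reshape is similar to reshape, but it creates a new array instead of changing the shape of the original one.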
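
A minimal sketch of the difference between the two functions, using array_shape to count the dimensions. The array contents are arbitrary.

```slang
variable a = [1:12];                   % 1-D array holding 1 ... 12
variable b = _reshape (a, [3, 4]);     % new 3x4 array; a keeps its shape

vmessage ("a: %d dimension(s), b: %d dimension(s)",
          length (array_shape (a)), length (array_shape (b)));

reshape (a, [3, 4]);                   % changes the shape of a itself
vmessage ("a: %d dimension(s), a[2,3] = %d",
          length (array_shape (a)), a[2,3]);
```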

Associative Arrays

Assoc_Type in S-Lang is comparable to a map in C++ and to a dictionary in Python. An associative array is created with one of the following forms:

Assoc_Type [type ]
Assoc_Type [type , default-value ]
Assoc_Type []

For example:

A = Assoc_Type [Int_Type];
A["alpha"] = 1;
A["beta"] = 2;
A["gamma"] = 3;

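When no type is specified, the array can store values of any type.

The following functions operate on associative arrays:

assoc_get_keys: returns the keys of the array
assoc_get_values: returns the values of the array
assoc_key_exists: tests whether a key exists
assoc_delete_key: deletes a key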

A program that counts word frequencies:

a = Assoc_Type [Int_Type];
foreach word (word_list)
{
    if (0 == assoc_key_exists (a, word))
        a[word] = 0;
    a[word]++; % same as a[word] = a[word] + 1;
}
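
As a sketch of the default-value form of Assoc_Type, the same count can be written without the assoc_key_exists test. The contents of word_list are sample data made up for this illustration, and the order in which the keys are printed is not specified.

```slang
variable word_list = ["to", "be", "or", "not", "to", "be"];   % sample data

% With a default value of 0, a key that has not been set yet reads as 0.
variable counts = Assoc_Type [Int_Type, 0];
variable word;

foreach word (word_list)
    counts[word]++;

foreach word (assoc_get_keys (counts))
    vmessage ("%s: %d", word, counts[word]);
```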

Structures and User-Defined Types

A structure can also be created by dereferencing Struct_Type:

t = @Struct_Type ("city_name", "population", "next");
t = @Struct_Type (["city_name", "population", "next"]);

The fields of a structure are accessed with the dot operator.

Linked Lists

A linked list can be built by giving the structure a next field:

t = struct { city_name, population, next };

The list can then be built with a function such as the following (read_next_city is assumed to be defined elsewhere):

define create_population_list ()
{
    variable city_name, population, list_root, list_tail;
    variable next;
    list_root = NULL;
    while (read_next_city (&city_name, &population))
    {
        next = struct { city_name, population, next };
        next.city_name = city_name;
        next.population = population;
        next.next = NULL;
        if (list_root == NULL)
            list_root = next;
        else
            list_tail.next = next;
        list_tail = next;
    }
    return list_root;
}
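
A minimal sketch of walking such a list by following the next field until it is NULL. The helper make_node and the city data are made up for this illustration, so that the example does not depend on read_next_city.

```slang
% make_node is a hypothetical helper that creates one list node.
define make_node (city_name, population, next)
{
    variable node = struct { city_name, population, next };
    node.city_name = city_name;
    node.population = population;
    node.next = next;
    return node;
}

% Three nodes with sample (not real) data: A -> B -> C -> NULL
variable root = make_node ("A", 100, make_node ("B", 250, make_node ("C", 75, NULL)));

variable total = 0, node = root;
while (node != NULL)
{
    total += node.population;
    node = node.next;
}
vmessage ("total population: %d", total);   % total population: 425
```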

typedef can be used to define new types.

Operator overloading

Operators can be overloaded with __add_binary. For example:

Define the structure type:

typedef struct { x, y, z } Vector_Type;

Define a function that creates and initializes a vector:

define vector_new (x, y, z)
{
    variable v = @Vector_Type;
    v.x = double(x); v.y = double(y); v.z = double(z);
    return v;
}

Define addition:

define vector_add (v1, v2)
{
    return vector_new (v1.x+v2.x, v1.y+v2.y, v1.z+v2.z);
}

Apply it to some variables:

V1 = vector_new (2,3,4);
V2 = vector_new (6,2,1);
V3 = vector_new (-3,1,-6);
V4 = vector_add (V1, vector_add (V2, V3));

Sometimes, however, we would rather write such operations with an operator. An operator can be defined for a type with __add_binary:

__add_binary (op , result-type , funct , typeA ,typeB );

For example:

define vector_minus (v1, v2)
{
    return vector_new (v1.x-v2.x, v1.y-v2.y, v1.z-v2.z);
}
__add_binary ("-", Vector_Type, &vector_minus, Vector_Type, Vector_Type);
define vector_eqs (v1, v2)
{
    return (v1.x==v2.x) and (v1.y==v2.y) and (v1.z==v2.z);
}
__add_binary ("==", Char_Type, &vector_eqs, Vector_Type, Vector_Type);
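
The example above registers - and ==. As a sketch under the same definitions, + can be registered with vector_add in the same way, so that the nested vector_add calls become an ordinary expression:

```slang
typedef struct { x, y, z } Vector_Type;

define vector_new (x, y, z)
{
    variable v = @Vector_Type;
    v.x = double (x); v.y = double (y); v.z = double (z);
    return v;
}

define vector_add (v1, v2)
{
    return vector_new (v1.x+v2.x, v1.y+v2.y, v1.z+v2.z);
}

% Use vector_add whenever + is applied to two Vector_Type values.
__add_binary ("+", Vector_Type, &vector_add, Vector_Type, Vector_Type);

variable V4 = vector_new (2,3,4) + vector_new (6,2,1) + vector_new (-3,1,-6);
vmessage ("(%g, %g, %g)", V4.x, V4.y, V4.z);   % (5, 6, -1)
```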

Lists

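Lists are modified with list_insert and list_append:

list_insert (list, obj, nth) and list_append (list, obj, nth)

To add an element at the head of a list, use list_insert (list, "hi", 0). Note that list = {"hi", list}; instead creates a new two-element list whose second element is the original list. list_delete (list, 2) deletes the element at index 2.

list_pop differs from list_delete in that list_pop returns the element it removes.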
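
A minimal sketch that runs the four list functions in sequence. The list contents are arbitrary, and the comments trace the state of the list after each call.

```slang
variable list = {"b", "c"};

list_insert (list, "a", 0);     % insert before index 0:   a b c
list_append (list, "d", -1);    % append after the last:   a b c d
list_delete (list, 1);          % delete index 1 ("b"):    a c d

% list_pop removes the element and returns it.
variable first = list_pop (list, 0);
vmessage ("popped %s, %d items left", first, length (list));   % popped a, 2 items left
```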

Debugging

exit

if (-1 == write_to_file ("/tmp/foo", "bar"))
{
    () = fprintf (stderr, "Write failed\n");
    exit (1);
}

Exceptions

define write_to_file (file, str)
{
    variable fp = fopen (file, "w");
    if (fp == NULL)
        throw OpenError;
    if (-1 == fputs (str, fp))
        throw WriteError;
    if (-1 == fclose (fp))
        throw WriteError;
}

try-catch

try
{
    write_to_file ("/tmp/foo", "bar");
}
catch OpenError:
{
    message ("*** Warning: failed to open /tmp/foo.");
}
next_statement;

Exception types

    AnyError
    OSError
    MallocError
    ImportError
    ParseError
    SyntaxError
    DuplicateDefinitionError
    UndefinedNameError
    RunTimeError
    InvalidParmError
    TypeMismatchError
    UserBreakError
    StackError
    StackOverflowError
    StackUnderflowError
    ReadOnlyError
    VariableUnitializedError
    NumArgsError
    IndexError
    UsageError
    ApplicationError
    InternalError
    NotImplementedError
    LimitExceededError
    MathError
    DivideByZeroError
    ArithOverflowError
    ArithUnderflowError
    DomainError
    IOError
    WriteError
    ReadError
    OpenError
    DataError
    UnicodeError
    InvalidUTF8Error
    UnknownError

A new exception type is created with new_exception:

new_exception (exception-name , baseclass , description );
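
A minimal sketch of defining an exception and catching it. The exception name ConfigError, its description, and the function load_config are made up for this illustration; RunTimeError from the list above is used as the base class.

```slang
% ConfigError is a hypothetical exception derived from RunTimeError.
new_exception ("ConfigError", RunTimeError, "Configuration error");

% load_config is a hypothetical function that always fails.
define load_config (file)
{
    throw ConfigError, "cannot parse $file"$;
}

try
{
    load_config ("/tmp/foo");
}
catch ConfigError:
{
    message ("*** Warning: caught a ConfigError.");
}
```

Because ConfigError is derived from RunTimeError, a catch RunTimeError: block would also catch it.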

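Files

Loading files: evalfile, autoload, and require

Modules

A module is loaded with import, for example import ("pcre");

It is better to load a module with require, for example require ("pcre");

Unlike the scope operator :: in C++, the namespace operator in S-Lang is ->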

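File I/O in S-Lang

fopen: open a file
fclose: close a file
fgets: read one line from a file
fputs: write a string to a file
fprintf: write formatted text to a file
fwrite: write objects to a file in binary form
fread: read objects from a file in binary form
fread_bytes: read bytes from a file as a binary string
feof: test whether the end of the file has been reached
ferror: test whether an error has occurred on the stream
clearerr: clear the error state of the stream
fflush: flush buffered output to the file
ftell: get the current position in the file
fseek: set the position in the file
fgetslines: read lines from a file as an array of strings

An example: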

define count_lines_in_file (file)
{
    variable fp, line, count;
    fp = fopen (file, "r"); % Open the file for reading
    if (fp == NULL)
        throw OpenError, "$file failed to open"$;
    count = 0;
    while (-1 != fgets (&line, fp))
        count++;
    () = fclose (fp);
    return count;
}
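
As a sketch of an alternative, the same count can be obtained with fgetslines, which reads the lines into an array. The function name count_lines_with_fgetslines and the file /tmp/slang_demo.txt are made up for this illustration; the file is written first so that the example is self-contained.

```slang
% Hypothetical variant of count_lines_in_file based on fgetslines.
define count_lines_with_fgetslines (file)
{
    variable fp = fopen (file, "r");
    if (fp == NULL)
        throw OpenError, "$file failed to open"$;
    variable lines = fgetslines (fp);    % array of strings, one per line
    () = fclose (fp);
    return length (lines);
}

% Write a small three-line file to count (the path is an assumption).
variable fp = fopen ("/tmp/slang_demo.txt", "w");
if (fp == NULL)
    throw OpenError, "could not create the demo file";
() = fputs ("one\ntwo\nthree\n", fp);
() = fclose (fp);

vmessage ("%d lines", count_lines_with_fgetslines ("/tmp/slang_demo.txt"));
```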