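I have recently become interested in S-Lang and started learning it; for me it serves as a complement to R.

S-Lang basics

Variables and functions

Variables are declared with the variable keyword, for example: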

variable x,y,z;

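A variable must be declared before use, but its data type need not be specified; the type is determined by assignment, for example:

x = 3; y = sin (5.6); z = "I think, therefore I am.";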

Functions are defined with the define keyword, for example:

define compute_average (x, y)
{
    variable s = x + y;
    return s / 2.0;
}

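Qualifiers

Qualifiers pass optional named values to a function, for example:

define plot (x, y)
{
    variable linestyle = qualifier ("linestyle", "solid");
    variable color = qualifier ("color", "black");
    sys_set_color (color);
    sys_set_linestyle (linestyle);
    sys_plot (x, y);
}

Here linestyle and color are the qualifiers; each falls back to its default value ("solid" and "black") when the caller does not supply it.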
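To see qualifiers from the caller's side, here is a minimal sketch. The function describe_line is hypothetical (it is not part of the notes or of S-Lang); it only returns a string, so the example runs without any plotting backend. Qualifiers follow the ordinary arguments after a semicolon.

```slang
% Hypothetical helper: reports which qualifier values were in effect.
define describe_line (x, y)
{
   variable linestyle = qualifier ("linestyle", "solid");
   variable color = qualifier ("color", "black");
   return sprintf ("%d,%d: %s %s", x, y, linestyle, color);
}

% No qualifiers: both defaults are used.
variable a = describe_line (1, 2);
% One qualifier given after the semicolon; the other keeps its default.
variable b = describe_line (1, 2; color="red");
% Both qualifiers given.
variable c = describe_line (1, 2; linestyle="dashed", color="blue");
```

Because qualifiers are optional and named, new ones can be added to a function later without changing existing calls.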

Strings

S-Lang manipulates strings without requiring you to allocate memory for them. The + operator can also be used in place of strcat. For example:

define concat_3_strings (a, b, c)
{
    return strcat (a, b, c);
}

In C, the equivalent function has to be written like this:

char *concat_3_strings (char *a, char *b, char *c)
{
    unsigned int len;
    char *result;
    len = strlen (a) + strlen (b) + strlen (c);
    if (NULL == (result = (char *) malloc (len + 1)))
        exit (1);
    strcpy (result, a);
    strcat (result, b);
    strcat (result, c);
    return result;
}
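Back on the S-Lang side, a short sketch comparing strcat with the + operator. The variable names are only illustrative.

```slang
variable a = "foo", b = "bar", c = "baz";

% strcat accepts any number of string arguments.
variable s1 = strcat (a, b, c);

% The + operator concatenates strings pairwise.
variable s2 = a + b + c;

% Both s1 and s2 now hold "foobarbaz".
```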

Referencing and Dereferencing

As in C, the & operator creates a reference to another object or function, for example:

define compute_functional_sum (funct)
{
    variable i, s;
    s = 0;
    for (i = 0; i < 10; i++)
    {
        s += (@funct)(i);
    }
    return s;
}
variable sin_sum = compute_functional_sum (&sin);
variable cos_sum = compute_functional_sum (&cos);

Dereferencing uses the @ operator, much like pointers in C, for example:

define set_xyz (x, y, z)
{
    @x = 1;
    @y = 2;
    @z = 3;
}
variable X, Y, Z;
set_xyz (&X, &Y, &Z);

The C equivalent:

void set_xyz (int *x, int *y, int *z)
{
    *x = 1;
    *y = 2;
    *z = 3;
}

Arrays

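S-Lang supports multidimensional arrays, for example:

variable A = Int_Type [10];      % a 1-d array of 10 integers
variable B = Int_Type [10, 3];   % a 10x3 2-d array of integers
variable C = [1, 3, 5, 7, 9];    % an array built directly from its elements
variable D = [1:9:2];            % the values from 1 to 9 in steps of 2
variable E = [0:1:#1000];        % 1000 floating-point values from 0 to 1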
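A small sketch for checking what these declarations produce, using the intrinsics length and array_shape. Note that S-Lang comments start with %, not #.

```slang
variable B = Int_Type [10, 3];
variable D = [1:9:2];          % 1, 3, 5, 7, 9
variable E = [0:1:#1000];      % the #n form fixes the number of elements

variable nD = length (D);          % number of elements in D
variable nE = length (E);          % number of elements in E
variable dims = array_shape (B);   % dimensions of B as an integer array
```

The [a:b:#n] form is convenient for floating-point ranges because the element count is stated explicitly instead of being derived from a step size.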

Arrays are passed to functions by reference, so a function can modify the caller's array directly:

define init_array (a)
{
    variable i, imax;
    imax = length (a);
    for (i = 0; i < imax; i++)
    {
        a[i] = 7;
    }
}
variable A = Int_Type [10];
init_array (A);

Array types

A comparison between S-Lang and C. In S-Lang, arithmetic on a whole array needs no explicit loop:

variable X, Y;
X = [0:2*PI:0.01];
Y = 20 * sin (X);

The equivalent C code:

double *X, *Y;
unsigned int i, n;
n = (2 * PI) / 0.01 + 1;
X = (double *) malloc (n * sizeof (double));
Y = (double *) malloc (n * sizeof (double));
for (i = 0; i < n; i++)
{
    X[i] = i * 0.01;
    Y[i] = 20 * sin (X[i]);
}
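The S-Lang version can be checked with a few more lines. This is only a sketch; max is the array intrinsic that returns the largest element.

```slang
variable X = [0:2*PI:0.01];
variable Y = 20 * sin (X);     % sin and * act on every element of X

% Y has exactly one value per element of X.
variable same_length = (length (X) == length (Y));

% The largest value of Y stays at or just below the amplitude 20.
variable peak = max (Y);
```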

Lists

A list is similar to an array, except that its elements may be of different types.
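A minimal sketch of a heterogeneous list, assuming nothing beyond the built-in list syntax with curly braces:

```slang
% One list holding an integer, a string, a double and an array.
variable mixed = {1, "two", 3.0, [4, 5]};

variable n = length (mixed);         % number of elements
variable t0 = typeof (mixed[0]);     % Integer_Type
variable t1 = typeof (mixed[1]);     % String_Type
variable t2 = typeof (mixed[2]);     % Double_Type
```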

struct

As in C, S-Lang has structures:

variable person = struct { first_name, last_name, age };
variable bill = @person;
bill.first_name = "Bill";
bill.last_name = "Clinton";
bill.age = 51;

A named structure type is usually defined with typedef:

typedef struct { first_name, last_name, age } Person_Type;
variable bill = @Person_Type;
bill.first_name = "Bill";
bill.last_name = "Clinton";
bill.age = 51;

An array of the newly defined structure type can be created directly:

variable People = Person_Type [100];
People[0].first_name = "Bill";
People[1].first_name = "Hillary";

Alternatively, create an array of Struct_Type and assign a copy of a structure to each element:

variable People = Struct_Type [100];
People[0] = @person;
People[0].first_name = "Bill";
People[1] = @person;
People[1].first_name = "Hillary";

Structure initialization function

A structure can be initialized with a function such as the following:

define create_person (first, last, age)
{
    variable person = @Person_Type;
    person.first_name = first;
    person.last_name = last;
    person.age = age;
    return person;
}
variable Bill = create_person ("Bill", "Clinton", 51);

String suffixes

The following statements all assign the same string:

file = "C:\\windows\\apps\\slrn.rc"; 
file = "C:\\windows\\apps\\slrn.rc"Q; 
file = "C:\windows\apps\slrn.rc"R; 
file = `C:\windows\apps\slrn.rc`; % slang-2.2 and above
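A sketch that checks the four forms against each other. It also shows the $ suffix, which is not in the list above but is used in the file example later in these notes: it substitutes variable values into the string. The name user_name is made up for illustration.

```slang
variable f1 = "C:\\windows\\apps\\slrn.rc";    % default: backslash escapes
variable f2 = "C:\\windows\\apps\\slrn.rc"Q;   % Q: backslash escapes, explicit
variable f3 = "C:\windows\apps\slrn.rc"R;      % R: raw, no escape processing
variable f4 = `C:\windows\apps\slrn.rc`;       % backquotes, slang-2.2 and above

variable all_equal = (f1 == f2) and (f2 == f3) and (f3 == f4);

% $ suffix: $user_name is replaced by the value of the variable.
variable user_name = "bill";
variable greeting = "hello, $user_name"$;
```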

Null_Type

define add_numbers (a, b)
{
    if (a == NULL) a = 0;
    if (b == NULL) b = 0;
    return a + b;
}
variable c = add_numbers (1, 2);
variable d = add_numbers (1, NULL);
variable e = add_numbers (1,);
variable f = add_numbers (,);

Ref_Type

sin_ref = &sin;
y = (@sin_ref) (1.0);

Other container types

DataType_Type

S-Lang has many data types:

Type                         Name
signed character             Char_Type
unsigned character           UChar_Type
short integer                Short_Type
unsigned short integer       UShort_Type
plain integer                Integer_Type
plain unsigned integer       UInteger_Type
long integer                 Long_Type
unsigned long integer        ULong_Type
long long integer            LLong_Type
single precision real        Float_Type
double precision real        Double_Type
complex numbers              Complex_Type
strings, C strings           String_Type
binary strings               BString_Type
structures                   Struct_Type
references                   Ref_Type
NULL                         Null_Type
arrays                       Array_Type
associative arrays/hashes    Assoc_Type
lists                        List_Type
DataType                     DataType_Type

Type conversion

The typecast function converts a value from one data type to another:

variable x = 10, y;
y = typecast (x, Double_Type);
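One place where the conversion matters is division. A small sketch, with illustrative variable names:

```slang
variable x = 10;
variable y = typecast (x, Double_Type);

variable q1 = x / 4;         % integer division, the fraction is dropped
variable q2 = y / 4;         % floating-point division
variable t = typeof (y);     % Double_Type
```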

Statement

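Generally speaking, a statement is built from expressions.

Functions

Two kinds of functions can be called from the interpreter: intrinsic functions and S-Lang functions. A function is declared in the form:

define function-name (parameter-list) { statement-list }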

Referencing and dereferencing functions:

define derivative (f, x)
{
    variable h = 1e-6;
    return ((@f)(x+h) - (@f)(x)) / h;
}
define x_squared (x)
{
    return x^2;
}
variable dydx = derivative (&x_squared, 3);

A function can be declared with an empty parameter list and then fetch its arguments from the stack inside the body:

define add_10 ()
{
    variable x;
    x = ();
    return x + 10;
}
variable s = add_10 (12); % ==> s = 22;

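A function can therefore also be defined as follows; note that the arguments come off the stack in reverse order, last argument first: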

define function_name ()
{
    variable x, y, ..., z;
    z = ();
    .
    .
    y = ();
    x = ();
    .
    .
}

Computing an average, where the final argument n says how many values are waiting on the stack:

define average_n (n)
{
    variable s, x;
    s = 0;
    loop (n)
    {
        x = (); % get next value from stack
        s += x;
    }
    return s / n;
}

When the number of arguments is not known in advance, _NARGS reports how many were passed:

define average_n ()
{
    variable x, s = 0;
    if (_NARGS == 0)
        usage ("ave = average_n (x, ...);");
    loop (_NARGS)
    {
        x = ();
        s += x;
    }
    return s / _NARGS;
}
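A usage sketch for the _NARGS approach. The function average_of is a hypothetical variant of average_n: the only change is that the sum starts at 0.0, so the final division is done in floating point even when every argument is an integer.

```slang
define average_of ()
{
   variable x, s = 0.0;       % 0.0 keeps the arithmetic in floating point
   if (_NARGS == 0)
     usage ("ave = average_of (x, ...);");
   loop (_NARGS)
     {
        x = ();               % pop the next argument from the stack
        s += x;
     }
   return s / _NARGS;
}

variable ave = average_of (1, 2, 3, 4);   % (1+2+3+4)/4 = 2.5
```

With an integer sum, 10 / 4 would be truncated to 2, so starting from 0.0 is a cheap safeguard.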

See also EXIT_BLOCK.

Namespaces

Names in a namespace can be declared private, public, or static. To load foo.sl into the namespace foo:

evalfile ("foo.sl", "foo");

% foo.sl
variable X = 1;
variable Y;
private variable Z;
public define set_Y (y) { Y = y; }
define set_z (z) { Z = z; }

Arrays

An array can also be created directly through Array_Type:

variable a = @Array_Type (data-type, integer-array);

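To change the shape of an array, for instance to turn a one-dimensional array into a two-dimensional one, use the reshape function:

reshape (array-name, integer-array)

For example:

variable a = Double_Type [100];
reshape (a, [10, 10]);

_reshape is similar to reshape, but it creates a new array instead of changing the shape of the original one.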
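The difference between the two functions can be seen in a short sketch. array_shape is the intrinsic that returns an array's dimensions; the variable names are illustrative.

```slang
variable a = [1:12];                     % 12 integers, 1 through 12

% _reshape returns a reshaped copy and leaves a alone.
variable b = _reshape (a, [3, 4]);
variable shape_a_before = array_shape (a);   % still one dimension
variable shape_b = array_shape (b);          % 3 by 4

% reshape changes a itself and returns nothing.
reshape (a, [2, 6]);
variable shape_a_after = array_shape (a);    % now 2 by 6
```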

Associative Arrays

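The Assoc_Type type in S-Lang is similar to std::map in C++ and to a Python dictionary: values are indexed by string keys. An Assoc_Type variable is created with one of the following forms:

Assoc_Type [type]
Assoc_Type [type, default-value]
Assoc_Type []

For example:

variable A = Assoc_Type [Int_Type];
A["alpha"] = 1;
A["beta"] = 2;
A["gamma"] = 3;

When no type is specified, the array can store values of any type.

The following functions operate on associative arrays: assoc_get_keys, assoc_get_values, assoc_key_exists, and assoc_delete_key.

A program that counts word frequencies: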

a = Assoc_Type [Int_Type];
foreach word (word_list)
{
    if (0 == assoc_key_exists (a, word))
        a[word] = 0;
    a[word]++; % same as a[word] = a[word] + 1;
}
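The word-count loop gets shorter when the associative array is created with a default value, since a missing key then reads as that default. A self-contained sketch, with a made-up word_list:

```slang
variable word_list = ["a", "b", "a", "c", "a", "b"];

% Second argument 0 is the default value returned for missing keys.
variable counts = Assoc_Type [Int_Type, 0];

variable word;
foreach word (word_list)
  counts[word]++;            % no assoc_key_exists test needed

variable num_keys = length (assoc_get_keys (counts));
```

After the loop, counts["a"] is 3, counts["b"] is 2, and counts["c"] is 1.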

Structures and User-Defined Types

A structure can also be created from a list of field names:

t = @Struct_Type ("city_name", "population", "next");
t = @Struct_Type (["city_name", "population", "next"]);

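The fields of a structure are accessed with the dot operator.

Linked Lists

A linked list is built by adding a next field to the structure:

t = struct { city_name, population, next };

The following function builds such a list: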

define create_population_list ()
{
    variable city_name, population, list_root, list_tail;
    variable next;
    list_root = NULL;
    while (read_next_city (&city_name, &population))
    {
        next = struct { city_name, population, next };
        next.city_name = city_name;
        next.population = population;
        next.next = NULL;
        if (list_root == NULL)
            list_root = next;
        else
            list_tail.next = next;
        list_tail = next;
    }
    return list_root;
}

New types can be defined with typedef.

Operator overloading

Operators can be overloaded with __add_binary. For example:

  1. Define the structure:

         typedef struct { x, y, z } Vector_Type;

  2. Define a constructor:

         define vector_new (x, y, z)
         {
             variable v = @Vector_Type;
             v.x = double(x); v.y = double(y); v.z = double(z);
             return v;
         }

  3. Define addition:

         define vector_add (v1, v2)
         {
             return vector_new (v1.x+v2.x, v1.y+v2.y, v1.z+v2.z);
         }

  4. Operate on variables:

         V1 = vector_new (2,3,4);
         V2 = vector_new (6,2,1);
         V3 = vector_new (-3,1,-6);
         V4 = vector_add (V1, vector_add (V2, V3));

So far, addition requires an explicit call to vector_add. To write operators such as + directly, register the implementing function with __add_binary:

__add_binary (op, result-type, funct, typeA, typeB);

For example:

define vector_minus (v1, v2)
{
    return vector_new (v1.x-v2.x, v1.y-v2.y, v1.z-v2.z);
}
__add_binary ("-", Vector_Type, &vector_minus, Vector_Type, Vector_Type);
define vector_eqs (v1, v2)
{
    return (v1.x==v2.x) and (v1.y==v2.y) and (v1.z==v2.z);
}
__add_binary ("==", Char_Type, &vector_eqs, Vector_Type, Vector_Type);
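Registering + works the same way as - and ==. The sketch below repeats the vector definitions so that it is self-contained, then registers vector_add for the + operator.

```slang
typedef struct { x, y, z } Vector_Type;

define vector_new (x, y, z)
{
   variable v = @Vector_Type;
   v.x = double (x); v.y = double (y); v.z = double (z);
   return v;
}

define vector_add (v1, v2)
{
   return vector_new (v1.x+v2.x, v1.y+v2.y, v1.z+v2.z);
}

% Tell the interpreter to call vector_add for Vector_Type + Vector_Type.
__add_binary ("+", Vector_Type, &vector_add, Vector_Type, Vector_Type);

variable V1 = vector_new (2, 3, 4);
variable V2 = vector_new (6, 2, 1);
variable V3 = vector_new (-3, 1, -6);
variable V4 = V1 + V2 + V3;      % components: 5, 6, -1
```

The result type is Vector_Type, so the sum can itself appear in further + expressions, as in the last line.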

Lists

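Lists are modified with list_insert and list_append, for example:

list_insert (list, obj, nth);
list_append (list, obj, nth);

list_insert places obj before position nth and list_append places it after position nth, so list_insert (list, "hi", 0) adds an element at the head. Note that list = {"hi", list}; does not prepend: it builds a new two-element list whose second element is the old list. list_delete (list, 2) deletes the element at index 2.

list_pop differs from list_delete in that list_pop also returns the removed element.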
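The list functions can be tried together in a short sketch. Positions are passed explicitly here so that no default is relied on; -1 refers to the last element.

```slang
variable list = {"b", "c"};

list_insert (list, "a", 0);      % before index 0:  a b c
list_append (list, "d", -1);     % after the last:   a b c d
list_append (list, "x", 0);      % after index 0:    a x b c d
list_delete (list, 1);           % drop index 1:     a b c d

% list_pop removes the element and hands it back.
variable first = list_pop (list, 0);     % "a"; list is now b c d
variable n = length (list);
```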

Debugging and error handling

exit

if (-1 == write_to_file ("/tmp/foo", "bar"))
{
    () = fprintf (stderr, "Write failed\n");
    exit (1);
}

Exceptions

define write_to_file (file, str)
{
    variable fp = fopen (file, "w");
    if (fp == NULL)
        throw OpenError;
    if (-1 == fputs (str, fp))
        throw WriteError;
    if (-1 == fclose (fp))
        throw WriteError;
}

try-catch

try
{
    write_to_file ("/tmp/foo", "bar");
}
catch OpenError:
{
    message ("*** Warning: failed to open /tmp/foo.");
}
next_statement;

Exception types

The built-in exceptions form a hierarchy; indentation shows which class each exception derives from:

    AnyError
        OSError
            MallocError
            ImportError
        ParseError
            SyntaxError
            DuplicateDefinitionError
            UndefinedNameError
        RunTimeError
            InvalidParmError
            TypeMismatchError
            UserBreakError
            StackError
                StackOverflowError
                StackUnderflowError
            ReadOnlyError
            VariableUnitializedError
            NumArgsError
            IndexError
            UsageError
            ApplicationError
            InternalError
            NotImplementedError
            LimitExceededError
            MathError
                DivideByZeroError
                ArithOverflowError
                ArithUnderflowError
                DomainError
            IOError
                WriteError
                ReadError
                OpenError
            DataError
            UnicodeError
                InvalidUTF8Error
            UnknownError

A new exception type is created with new_exception:

new_exception (exception-name, baseclass, description);
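A minimal sketch of defining, throwing, and catching a custom exception. ConfigError and load_setting are hypothetical names used only for illustration. The try (e) form stores details of the caught exception in e, including the message passed to throw.

```slang
% Create ConfigError as a subclass of RunTimeError.
new_exception ("ConfigError", RunTimeError, "Configuration error");

% Hypothetical function that only knows one setting.
define load_setting (name)
{
   if (name != "color")
     throw ConfigError, "unknown setting: " + name;
   return "black";
}

variable msg = "", e;
try (e)
{
   () = load_setting ("size");
}
catch ConfigError:
{
   msg = e.message;          % the string given to throw
}
```

Because ConfigError derives from RunTimeError, a catch RunTimeError: block would also catch it.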

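File operations

Loading files: evalfile, autoload, and require

Modules

Modules are loaded with import, for example import ("pcre");

It is better to load them with require, for example require ("pcre");

Unlike C++, whose scope operator is ::, S-Lang uses -> to access a name inside a namespace.

File I/O in S-Lang

An example: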

define count_lines_in_file (file)
{
    variable fp, line, count;
    fp = fopen (file, "r"); % Open the file for reading
    if (fp == NULL)
        throw OpenError, "$file failed to open"$;
    count = 0;
    while (-1 != fgets (&line, fp))
        count++;
    () = fclose (fp);
    return count;
}