Surrogate modelling

Title: Smart Sampling Algorithm for Surrogate Model Development

The algorithm helps convert a high-fidelity simulation model into a computationally inexpensive surrogate model that captures its essential features with prescribed numerical accuracy.

Surrogate modelling [1]

Polynomial Response Surface Models (PRSM)
Radial Basis Functions
Support Vector Regression
Artificial Neural Networks

Sampling method

Non-adaptive: grid-based, pattern/geometry-based, stochastic and quasi-random
Adaptive:

Problem Statement

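The original model describes an experiment, or a unit, process, or system that is too complex to compute. To address this, we replace the original model with a surrogate. Overall, the problem is as follows:

Find a surrogate model:
Termination criterion: the maximum number of samples or the surrogate accuracy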

Motivation

Sampling methods

Uniform (US)
Random (RS)
Statistical
Central composite design
Quasi-random low-discrepancy
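
To make the difference between these sampling families concrete, here is a minimal sketch in R that places n points on [0, 1] in three ways. The helper `van_der_corput` and the choice n = 8 are illustrative assumptions, not taken from the paper; the van der Corput sequence is used only as a simple one-dimensional example of a low-discrepancy sequence.

```r
# Three ways to place n sample points on [0, 1] (illustrative sketch).
n <- 8

# Uniform sampling (US): evenly spaced grid including both end points.
x_uniform <- seq(0, 1, length.out = n)

# Random sampling (RS): independent draws from U(0, 1).
set.seed(1)  # fixed seed so the draw is reproducible
x_random <- runif(n)

# Quasi-random low-discrepancy: van der Corput sequence in base 2
# (the one-dimensional building block of the Halton sequence).
van_der_corput <- function(n, base = 2) {
  vapply(seq_len(n), function(k) {
    value <- 0
    denom <- 1
    while (k > 0) {
      denom <- denom * base
      value <- value + (k %% base) / denom  # next digit of k, mirrored
      k <- k %/% base
    }
    value
  }, numeric(1))
}
x_quasi <- van_der_corput(n)

print(x_uniform)
print(sort(x_random))
print(x_quasi)
```

Random draws can cluster and leave gaps, whereas the grid and the low-discrepancy sequence cover the interval more evenly.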

Problem

How many sample points should we use?
How should we generate them?
How do they affect the quality of the surrogate approximation?
Which sampling method gives the best approximation for a given form of S(x)?

Key Concepts

Normalize:
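This paper differs from Cozad's paper: Cozad's paper focuses on finding the best surrogate model, whereas this paper asks how best to choose the sample points for a given surrogate model.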

Crowding Distance Metric (CDM):
This is the spatial characteristic.
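
The R function `CDMCalc` in the example at the end of these notes computes this metric as the sum of squared distances from one sample point to all the others. Assuming that matches the paper's definition, for I sample points x^{(1)}, …, x^{(I)} it can be written as:

```latex
\mathrm{CDM}_i = \sum_{j=1}^{I} \left\lVert x^{(i)} - x^{(j)} \right\rVert^2,
\qquad i = 1, 2, \dots, I
```

A larger value means the point lies farther, on average, from the other samples.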

Departure Function:
indicates the surrogate model derived from I sample points (i = 1, 2, …, I)
indicates the surrogate derived from all points except
This is the quality characteristic:
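
The following sketch shows one way to evaluate such a leave-one-out quantity in R. It assumes the departure function is the difference between the prediction of the surrogate fitted on all points and that of the surrogate fitted without point j; the exact form in the paper may differ. The sample points and test function are those of the example in these notes, and the helper names `fit_surrogate` and `departure` are hypothetical.

```r
# Sample points and test function taken from the example in these notes.
x <- c(0, 0.35, 0.47, 0.55, 0.69, 1)
y <- (6 * x - 2)^2 * sin(12 * x - 4)
samples <- data.frame(x = x, y = y)

# Quartic polynomial surrogate S(x), fitted by least squares.
fit_surrogate <- function(data) {
  lm(y ~ x + I(x^2) + I(x^3) + I(x^4), data = data)
}

# Assumed form of the departure function: the change in the prediction
# at x_new when sample point j is left out of the fit.
departure <- function(data, j, x_new) {
  full <- fit_surrogate(data)
  reduced <- fit_surrogate(data[-j, ])
  new_data <- data.frame(x = x_new)
  unname(predict(full, new_data) - predict(reduced, new_data))
}

# Departure at x = 0.2 for each left-out point j = 1, ..., 6.
delta <- vapply(seq_len(nrow(samples)),
                function(j) departure(samples, j, 0.2),
                numeric(1))
print(delta)
```

A large absolute value for some j suggests that the surrogate near x = 0.2 depends strongly on that sample point.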

Optimal Points Placement

Select points that are far from the existing points
Select points that have a great influence on S(x)

Define an NLP problem: [2]
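
Since the NLP itself is not written out in these notes, the sketch below only illustrates the two criteria above. It assumes a placement score that sums, over all sample points j, the squared distance to point j multiplied by the squared leave-one-out departure, and it replaces the NLP solver with a plain grid search over [0, 1]. Both the score and the names `placement_score` and `candidates` are assumptions made for illustration, not the paper's formulation.

```r
# Sample points and test function taken from the example in these notes.
x <- c(0, 0.35, 0.47, 0.55, 0.69, 1)
y <- (6 * x - 2)^2 * sin(12 * x - 4)
samples <- data.frame(x = x, y = y)

fit_surrogate <- function(data) {
  lm(y ~ x + I(x^2) + I(x^3) + I(x^4), data = data)
}

# Hypothetical placement score (an assumption, not the paper's objective):
# sum over j of (distance to point j)^2 * (leave-one-out departure)^2.
placement_score <- function(data, x_new) {
  new_data <- data.frame(x = x_new)
  full_pred <- predict(fit_surrogate(data), new_data)
  score <- numeric(length(x_new))
  for (j in seq_len(nrow(data))) {
    loo_pred <- predict(fit_surrogate(data[-j, ]), new_data)
    distance_sq <- (x_new - data$x[j])^2
    score <- score + distance_sq * (full_pred - loo_pred)^2
  }
  unname(score)
}

# Grid search stands in for the NLP solver in this one-dimensional case.
candidates <- seq(0, 1, length.out = 501)
scores <- placement_score(samples, candidates)
x_next <- candidates[which.max(scores)]
print(x_next)
```

In more than one dimension a grid search quickly becomes impractical, which is presumably why the placement is posed as an optimization problem.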

Smart Sampling Algorithm

Estimating the surrogate model S(x) requires at least P sample points:

Set K = P+1 <
Generate K sample points with a sampling method; US is shown to be the best,
compute
Using () to generate
If , stop. Otherwise, proceed to the next step
calculate
Solve the NLP, add p = p+1, set
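
The overall loop can be sketched in R as follows. Several pieces are assumptions, because the corresponding formulas are missing from these notes: the accuracy check uses a leave-one-out RMSE, the new point maximizes a hypothetical distance-times-departure score over a candidate grid instead of solving an NLP, and `max_samples` and `tolerance` are arbitrary values. The function `f` stands in for the expensive model.

```r
f <- function(x) (6 * x - 2)^2 * sin(12 * x - 4)  # stand-in for the expensive model

fit_surrogate <- function(data) {
  lm(y ~ x + I(x^2) + I(x^3) + I(x^4), data = data)
}

# Leave-one-out predictions at x_new: one column per left-out sample point.
loo_predictions <- function(data, x_new) {
  vapply(seq_len(nrow(data)), function(j) {
    unname(predict(fit_surrogate(data[-j, ]), data.frame(x = x_new)))
  }, numeric(length(x_new)))
}

P <- 5             # number of coefficients in the quartic surrogate
max_samples <- 12  # assumed sampling budget
tolerance <- 0.5   # assumed accuracy target for the leave-one-out RMSE
candidates <- seq(0, 1, length.out = 501)

# Start with K = P + 1 uniformly spaced points.
x0 <- seq(0, 1, length.out = P + 1)
samples <- data.frame(x = x0, y = f(x0))

repeat {
  # Assumed accuracy measure: error of each leave-one-out fit at its own point.
  loo_at_samples <- diag(loo_predictions(samples, samples$x))
  loo_rmse <- sqrt(mean((samples$y - loo_at_samples)^2))
  if (loo_rmse <= tolerance || nrow(samples) >= max_samples) break

  # Hypothetical placement score: squared distance times squared departure.
  full_pred <- unname(predict(fit_surrogate(samples), data.frame(x = candidates)))
  departure_sq <- (full_pred - loo_predictions(samples, candidates))^2
  distance_sq <- outer(candidates, samples$x, function(a, b) (a - b)^2)
  scores <- rowSums(distance_sq * departure_sq)
  scores[apply(distance_sq, 1, min) < 1e-12] <- -Inf  # skip existing points

  # Evaluate the model at the best candidate and add it to the sample set.
  x_next <- candidates[which.max(scores)]
  samples <- rbind(samples, data.frame(x = x_next, y = f(x_next)))
}

print(samples)
```

The loop stops as soon as either termination criterion is met, so it never evaluates `f` more than `max_samples` times.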

Example

# The example
x=c(0, 0.35, 0.47, 0.55, 0.69, 1)
y=(6*x-2)^2*sin(12*x-4)

# S(x)=a0+a1*x+a2*x^2+a3*x^3+a4*x^4
inputData<-data.frame("x"=x,"y"=y)
fit4<-lm(y~x+I(x^2)+I(x^3)+I(x^4),data=inputData)
summary(fit4)

# coefficients
coef4<-fit4$coefficients

# Crowding distance metric (CDM): for each sample point, the sum of
# squared distances to all sample points

CDMCalc<-function(x){
  vapply(x,function(x_i) sum((x_i-x)^2),numeric(1))
}

# Calculate CDM and rank the points from most to least isolated
CDMValue<-CDMCalc(x)
orderCDM<-order(CDMValue,decreasing = TRUE)

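# Leave-one-out fit: drop the first sample point and refit on the other five.
# With 5 points and 5 coefficients the quartic interpolates them exactly,
# so summary() reports no residual degrees of freedom.
inputNum<-c(2,3,4,5,6)
inputData6<-inputData[inputNum,]
fit1<-lm(y~x+I(x^2)+I(x^3)+I(x^4),data=inputData6)
coef1<-fit1$coefficients
summary(fit1)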

# # Symbolic computation in R, using the Ryacas package
# library(Ryacas)
# t<-Sym("t")
# 
# tmatrix<-List(List(1,t,t**2,t**3,t**4))
# coef1<-List(coef1)
# dataMatrix<-tmatrix*coef1

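[1] For a comparison of the literature, see the first page of the paper.
[2] Consider the point-selection problem for CCM: select points for a given length, for multiple spatial CCM.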