Call by reference in R (only a simulated one, though) - Ramblings of someone who used to work at a data analysis company

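A rather interesting article on a topic everyone loves was posted, so I wrote this up partly as English practice.
Content aside, I cannot guarantee that my wording is accurate...
I do read a fair amount, so I figure I should produce some output every now and then.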

Original article

Call by reference in R | (R news & tutorials)
Strictly speaking, this is the actual source:
https://rlearner.wordpress.com/2011/09/11/call-by-reference-in-r/

Rough side-by-side translation

Sometimes it is convenient to use “call by reference evaluation” inside an R function. For example, if you want to have multiple return value for your function, then either you return a list of return value and split them afterward or you can return the value via the argument.

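Sometimes it is convenient to use "call by reference" inside an R function. For example, when you want your own function to have multiple return values, you can either return a list of those values and split it apart afterward, or return the values through the arguments.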
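The first option (return a list and split it afterward) can be sketched as follows. This is only a minimal illustration; div_mod is a hypothetical helper that does not appear in the original article.

```r
# Hypothetical helper: return two results bundled in a single list.
div_mod <- function(a, b) {
  list(quotient = a %/% b, remainder = a %% b)
}

# Call once, then split the list into separate variables.
res <- div_mod(17, 5)
q <- res$quotient
r <- res$remainder
```

Here q ends up as 3 and r as 2. Nothing outside the function is modified, which is the usual R style.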

For some reasons(I would like to know too), R do not support call by reference. The first reason come up in my mind is safety, if the function can do call by reference, it is more difficult to trace the code and debug(you have to find out which function change the value of your variables by examining the details of your function). In fact, R do “call by reference” when the value of the argument is not changed. They will make a copy of the argument only when the value is changed. So we can expect there’s no efficiency gain (at least not a significant one) even we can do call by reference.

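For some reason (I would like to know it too), R does not support call by reference. The first reason that comes to mind is safety: if call by reference were possible, tracing and debugging code would become harder (you would have to inspect each function in detail to find out which one changed the value of a variable).
In fact, R effectively does "call by reference" as long as the value of the argument is not changed; it makes a copy of the argument only when the value is modified. So even if call by reference were available, we can expect no efficiency gain (at least not a significant one).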
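The visible effect of this copy-on-modify behavior is that a change made to an argument inside a function never reaches the caller. A small sketch (double_first is a made-up name for illustration):

```r
# Hypothetical function: modifies its argument, which only affects a local copy.
double_first <- function(v) {
  v[1] <- v[1] * 2   # the copy happens here, when v is first modified
  v
}

orig <- c(1, 2, 3)
changed <- double_first(orig)
# orig is still c(1, 2, 3); only the returned vector `changed` is c(2, 2, 3)
```

The caller sees the new value only because it is returned and assigned to changed.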

Anyway, it is always good to know how to have a “pseudo call by reference” in R (you can choose (not) to use it for whatever reason). The trick to implement call by reference is to make use of the eval.parent function in R. You can add a code to replace the argument value in the parent environment so that the function looks like implementing the call by reference evaluation strategy. Here are some examples of how to do it:

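Anyway, it is good to know how to achieve a "pseudo call by reference" in R (you can choose to use it, or not, for whatever reason). The trick is to use R's eval.parent function. To make a function look as if it implements call by reference, you can add code that replaces the value of the argument in the parent environment. Here are some examples: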

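# Assign `value` to the caller's variable that was passed as `x`.
set <- function(x, value) {
   eval.parent(substitute(x <- value))
}
valX <- 51
set(valX, 10)
valX
>[1] 10

# Increment the caller's variable in place (`value` is unused here).
addOne_1 <- function(x, value) {
   eval.parent(substitute(x <- x + 1))
}
valX <- 51
addOne_1(valX)
valX
>[1] 52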

Incidentally, addOne_1() above does not need its second argument. Perhaps it is there just to match the other functions.
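The same trick can be applied to the "multiple return values via the arguments" case mentioned at the start. The following is only a sketch under my own assumptions; div_mod_ref and the variable names quo and rem are hypothetical and not from the original article.

```r
# Hypothetical function: writes two results into the caller's variables.
# substitute() swaps q, r, a and b for the expressions the caller passed,
# and eval.parent() runs the resulting assignments in the caller's frame.
div_mod_ref <- function(a, b, q, r) {
  eval.parent(substitute({
    q <- a %/% b
    r <- a %% b
  }))
  invisible(NULL)
}

quo <- NA
rem <- NA
div_mod_ref(17, 5, quo, rem)
# quo is now 3 and rem is now 2
```

Note that q and r must be passed as plain variable names. Passing something like a constant would produce an invalid assignment in the caller.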

Note that you could not change the value of x inside the function. If you change the value of x, a new object will be created. The substitute function will replace x with the new value and hence this method wont work. For example

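Note that you cannot change the value of x inside the function. If you change the value of x, a new object is created. The substitute function then replaces x with the new value, so this method will not work. For example: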

addOne_2<-function(x,value){
   x<-x+1
   eval.parent(substitute(x<-x))
}
valX <- 51
addOne_2(valX)
>Error in 52 <- 52 : invalid (do_set) left-hand side to assignment
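To see where the "52 <- 52" in the error message comes from, it may help to look at what substitute returns in each situation. The two helpers below (peek_untouched and peek_reassigned) are hypothetical names used only for this illustration.

```r
# x is never assigned, so it is still the promise created by the call;
# substitute() returns the expression the caller wrote.
peek_untouched <- function(x) {
  substitute(x)
}

# After the assignment, x is an ordinary local variable;
# substitute() now returns its value instead of the caller's expression.
peek_reassigned <- function(x) {
  x <- x + 1
  substitute(x)
}

valX <- 51
a <- peek_untouched(valX)    # the symbol valX
b <- peek_reassigned(valX)   # the number 52
```

With the symbol, x <- x + 1 becomes valX <- valX + 1 in the caller. With the value, x <- x becomes 52 <- 52, which is not a valid assignment.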

If you want to change the value of x inside the function, you have to copy x to a new object and use new object as x. At the end of the function, you can replace the value of x with the new object at the parent environment.

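If you want to change the value of x inside the function, you have to copy x to a new object and use that new object in place of x. At the end of the function, you can replace the value of x in the parent environment with the new object.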

addOne_3<-function(x,value){
   xx<-x
   xx<-xx+1
   eval.parent(substitute(x<-xx))
}
valX <- 51
addOne_3(valX)
valX
>[1] 52

Another way to do call by reference more formally is using the R.oo packages.

Another, more formal way to do call by reference is to use the R.oo package.
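I have not tried R.oo itself here. As a rough base-R illustration of the same idea of a mutable object, an environment can be passed to a function and modified in place, since environments are not copied when they are modified. This is a sketch of that general mechanism, not of the R.oo API; add_one_env and counter are made-up names.

```r
# Hypothetical function: modifies a field of the environment it receives.
# Environments have reference semantics, so the caller sees the change.
add_one_env <- function(env) {
  env$value <- env$value + 1
  invisible(env)
}

counter <- new.env()
counter$value <- 51
add_one_env(counter)
# counter$value is now 52
```

No eval.parent or substitute is needed in this version, at the cost of wrapping the data in an environment.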

Thoughts

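So far I have not felt the need to go this far to get call by reference.
That said, it is also true that I hesitate even to write functions when working with fairly large data.
The only options are to use bigmemory or to wait for R-2.14 to arrive soon.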

I have never seriously studied environments and the like, so I think I will start.