Monday, March 20, 2017

A better replication function: rep() for R

This article proposes a new approach to the R replication function rep(). I am not yet fluent in creating R packages and will not be building one for a while, but I will try to provide the code I have in mind.

First of all, let's review the current implementation of the R replication function rep() in my own words:
  rep(Vctr,times=Vctr1,each=n,length.out=N)
Each element of Vctr is repeated n times when 'each' is given. If 'times' is a vector with one entry per element, the entries of Vctr1 specify how many times the corresponding elements are repeated. However, if 'times' is a single number, the whole of Vctr is repeated that number of times. If length.out is given, 'times' is disregarded: each element is repeated n times, and the resulting sequence is repeated again until length.out is reached.
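As a quick check of the description above, here is a small sketch in base R. The input vector x and the variable names are only for illustration; the calls themselves use nothing beyond the standard rep() arguments.

```r
# Illustrative input; the name x is an assumption for this example only
x <- c("a", "b", "c")

whole_twice   <- rep(x, times = 2)                   # single number: whole vector repeated
per_element   <- rep(x, times = c(1, 2, 3))          # vector: one count per element
each_twice    <- rep(x, each = 2)                    # every element repeated twice
recycled      <- rep(x, each = 2, length.out = 8)    # recycled until length 8 is reached
times_ignored <- rep(x, times = 5, length.out = 4)   # length.out takes priority over times

print(whole_twice)    # "a" "b" "c" "a" "b" "c"
print(per_element)    # "a" "b" "b" "c" "c" "c"
print(each_twice)     # "a" "a" "b" "b" "c" "c"
print(recycled)       # "a" "a" "b" "b" "c" "c" "a" "a"
print(times_ignored)  # "a" "b" "c" "a"
```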
From the description above, when 'times' is a vector, its elements control how often the corresponding elements of Vctr are repeated. But when 'times' is reduced to a single number, it instead controls how many times the whole of Vctr is repeated. 'each', on the other hand, is a single number that also controls how many times each element of Vctr is repeated. Given this, it seems logical to me to reconsider the case where 'times' is a single number, and to treat it as the special case in which all elements of Vctr are repeated the same number of times, i.e.
    Vctr1 = 5 := c(5, 5, 5, ...).
With this equivalence, I would also like to propose switching the meanings of 'times' and 'each', so that 'each' now describes how many times each element of Vctr should be repeated. With the meaning of the new 'each' settled, we should now reconsider the meaning of the new 'times'.
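The equivalence between a single number and a vector of identical counts can already be observed with the current rep(). The following is a minimal sketch; x and n are illustrative names, not part of the proposal.

```r
# Illustrative values; x and n are assumptions for this example only
x <- 1:4
n <- 3

via_each  <- rep(x, each = n)                    # single count applied to every element
via_times <- rep(x, times = rep(n, length(x)))   # same count spelled out as c(3, 3, 3, 3)

print(via_each)                        # 1 1 1 2 2 2 3 3 3 4 4 4
print(identical(via_each, via_times))  # TRUE
```

In other words, a scalar count behaves exactly like the expanded vector, which is what motivates letting one argument cover both cases.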

The newly proposed meaning for 'times' is the number of times to repeat the sequence generated by 'each'. The new meaning for length.out is the maximum length of the final output.
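To make the difference for length.out concrete, here is a small sketch using only base R. The current rep() treats length.out as a target length and recycles to reach it; the proposed reading treats it as a cap. The use of head() to imitate the cap is my own stand-in for illustration, not the proposed function itself.

```r
# Illustrative input; x is an assumption for this example only
x <- 1:3

# Current behaviour: the sequence is recycled until 7 values exist
current <- rep(x, length.out = 7)

# Proposed reading, imitated with head(): 7 is only an upper bound,
# so a 3-value sequence repeated once stays at 3 values
proposed <- head(rep(x, times = 1), 7)

print(current)   # 1 2 3 1 2 3 1
print(proposed)  # 1 2 3
```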

With the proposed changes outlined above, here are a few examples to demonstrate the new function. It is written new_rep() here because a hyphen is not valid in an R name (new-rep(...) would be parsed as new minus rep(...)).

> new_rep(1:5, each=2, times=1, length.out=13)
[1] 1 1 2 2 3 3 4 4 5 5

> new_rep(1:5, each=2, times=2, length.out=13)
[1] 1 1 2 2 3 3 4 4 5 5 1 1 2

> new_rep(1:5, each=c(1,2,3,2,1), times=1, length.out=13)
[1] 1 2 2 3 3 3 4 4 5

> new_rep(1:5, each=c(1,2,3,2,1), times=2, length.out=13)
[1] 1 2 2 3 3 3 4 4 5 1 2 2 3

Can we reproduce everything that can be done with the old rep() function? Yes!
Can we create something that is not possible with a single call to the old rep() function? Yes.
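To illustrate the second answer: the last example above (per-element counts, the whole sequence repeated twice, capped at 13 values) needs three separate steps with the current rep(). This is a sketch in base R only; the intermediate variable names are mine.

```r
# Illustrative names; only base R functions are used
x <- 1:5
counts <- c(1, 2, 3, 2, 1)

per_element <- rep(x, times = counts)        # 1 2 2 3 3 3 4 4 5
repeated    <- rep(per_element, times = 2)   # the 9-value sequence twice, 18 values
result      <- head(repeated, 13)            # cap the output at 13 values

print(result)  # 1 2 2 3 3 3 4 4 5 1 2 2 3
```

In the other direction, a current call such as rep(x, times = 2) would correspond to each = 1 and times = 2 under the proposed meanings.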

Possible algorithm:
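# Named new_rep because new-rep is not a valid R name (it parses as new - rep).
# The defaults (each = 1, times = 1, length.out = NA) are added so every argument is optional.
new_rep <- function(Vctr, each = 1, times = 1, length.out = NA) {
    if (length(each) == 1) {  # 'each' is a single number
        each <- rep(each, times = length(Vctr))  # expand to one count per element
    }
    Rslt <- rep(Vctr, times = each)   # repeat every element by its own count
    Rslt <- rep(Rslt, times = times)  # repeat the whole resulting sequence
    Lngth <- length(Rslt)
    # length.out is a maximum: truncate only when it is shorter than the result
    if (is.numeric(length.out) && !is.na(length.out) && length.out < Lngth) {
        Lngth <- length.out
    }
    Rslt[seq_len(Lngth)]  # seq_len() also handles a length of zero safely
}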
I would appreciate your thoughts, and any help with turning this into a package.