Friday, January 31, 2020

reverse proxy in java

Several reverse proxy implementations are available in Java, but I found the following library quite useful:
https://github.com/mitre/HTTP-Proxy-Servlet

We can add the servlet class to an existing web application and configure it in web.xml; no source code needs to be changed.
Using this, we can delegate PDF generation from the servlet to a Node server.

---

Here's an example excerpt of a web.xml file to communicate to a Solr server:
<servlet>
    <servlet-name>solr</servlet-name>
    <servlet-class>org.mitre.dsmiley.httpproxy.ProxyServlet</servlet-class>
    <init-param>
      <param-name>targetUri</param-name>
      <param-value>http://solrserver:8983/solr</param-value>
    </init-param>
    <init-param>
      <param-name>log</param-name>
      <param-value>true</param-value>
    </init-param>
</servlet>
<servlet-mapping>
  <servlet-name>solr</servlet-name>
  <url-pattern>/solr/*</url-pattern>
</servlet-mapping>
--

We may further combine (or replace) this with a Rust proxy based on tokio and headless Chrome to generate PDFs.

Puppeteer on Rust

Puppeteer is a powerful tool for controlling a web application from the server side using headless Chromium.
It is mainly used for automated testing, but one of its useful features is generating a PDF from HTML.

One of the main problems with generating a PDF from HTML in the client-side browser is that the layout may change when the browser is updated (I am only considering Chrome/Chromium here).

So if the PDF generation is done on the server side, we can pin a specific version of Chromium on the server, and clients can update their browsers without losing the correct PDF layout.

Puppeteer can be installed with 'npm i puppeteer'.
The following is sample code for an HTTP server that converts HTML into a PDF.
This service takes parameters, generates an HTML string from them, and creates a PDF from that string, so no file system is used for the HTML or the PDF (efficient).


const http = require('http');
const puppeteer = require('puppeteer');

http.createServer((request, response) => {
    //console.log('request.url: '+request.url)
    //console.log('request.method: '+request.method)
    if (request.method === 'POST' && request.url === '/nodejs/template-pdf-gen') {
      // A request body can arrive in several chunks, so collect them and parse once at the end.
      var chunks = [];
      request.on('data', (chunk) => {
        chunks.push(chunk);
      }).on('end', async () => {
        var ctx = JSON.parse(Buffer.concat(chunks).toString());
        const browser = await puppeteer.launch({ headless: true });
        const page = await browser.newPage();
        var url = 'file://'+ctx.webRootPath.replace(/\\/g, '/');
        // generate_html is the application's own template function (not shown here).
        var html = generate_html(ctx);

        await page.goto(url);
        await page.setContent(html);

        var pdf = await page.pdf({
          format: 'A4',
          margin: {
                top: "20px",
                left: "20px",
                right: "20px",
                bottom: "20px"
          }   
        });
        await browser.close();        
        response.end(pdf);
      });
    } else {
      response.statusCode = 404;
      response.end();
    }
  }).listen(8989);



There are a few tricks in the above code.
1) To resolve link elements in the HTML that refer to local files, we need to call page.goto(url) before calling page.setContent(html).

2) The web root location must be provided by the client. (We could hard-code it, but if the application is deployed as a war file, the folder location is not known statically.)

3) In a Java servlet, getServletContext().getRealPath("/") provides this information. webRootPath in the above code holds this value.

----

If we are only interested in generating PDFs, there is a Rust library that does the same job:
 https://github.com/atroche/rust-headless-chrome

There is another similar tool that is not restricted to Chrome: https://github.com/jonhoo/fantoccini

Saturday, January 18, 2020

reverse proxy server in rust

There are several reverse proxy servers in Rust.
The following crate seems easy to use.

https://docs.rs/hyper-reverse-proxy/0.4.0/hyper_reverse_proxy/

sozu is difficult to build on Windows, and a bit old (2018).

If we use this, an existing server app running on GlassFish can be moved to a Rust-based server without affecting the client-side application. For instance, DB-related server-side code can be migrated to a Rust-based server.

extern crate hyper;
extern crate hyper_reverse_proxy;
extern crate futures;

use hyper::server::conn::AddrStream;
use hyper::{Body, Request, Server};
use hyper::service::{service_fn, make_service_fn};
use futures::future::{Future};

fn main() {

    // This is our socket address...
    let addr = ([127, 0, 0, 1], 13900).into();

    // A `Service` is needed for every connection.
    let make_svc = make_service_fn(|socket: &AddrStream| {
        let remote_addr = socket.remote_addr();
        service_fn(move |req: Request<Body>| { // returns BoxFut
            println!("path: {}, ip: {}", req.uri().path(), remote_addr.ip());
            return hyper_reverse_proxy::call(remote_addr.ip(), "http://127.0.0.1:8080", req)
        })
    });

    let server = Server::bind(&addr)
        .serve(make_svc)
        .map_err(|e| eprintln!("server error: {}", e));

    println!("Running server on {:?}", addr);

    // Run this server for... forever!
    hyper::rt::run(server);
}






Wednesday, January 15, 2020

Parallel matrix computation in Rust

Parallel matrix computation is an interesting topic to test how much the zero cost abstraction is achieved in Rust.
Also if it can increase the performance, it is practically useful.

These should be used for Magnus expansion.

I will write such code later.


Multi threaded HTTP server using crossbeam

The Rust book has a section on multi threaded HTTP server, and I found the corresponding implementation on github.
https://github.com/richardknox/rust-webserver

This was published two years ago.
The code is pretty neat, but it would be interesting to implement it with crossbeam so that it can use an SPMC channel directly. This is basically the typical situation for a load-balancing server.

In fact, the straightforward modification did not work.
It seems better to use hyper.

There is an interesting GitHub issue that is quite close to my goal, but it does not use crossbeam.
So it would be better to use this as the basis for a multi-threaded HTTP server implementation with crossbeam.

https://github.com/hyperium/hyper/issues/1358

https://gist.github.com/klausi/93b57d3abe2c5bc4975b0b9921f2a3c2


Go program may be useful:
 https://github.com/dc0d/tcpserver/blob/master/tcpserver.go

------

After trying to combine crossbeam and tokio, I found that this is not simple.
In particular, the latest Tokio approach relies on async, and async and threads do not seem to coexist easily.
Once the processing function includes async code, all the calling code must be async.
And the closure passed to a thread should not be async, and it cannot borrow free variables because of their lifetimes.
There has been some discussion of mpsc channels versus an mpmc-like approach; if we use Tokio and hyper, we may not need mpmc at all, at least for HTTP-related applications,
since they already support multi-threaded handlers.
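The lifetime restriction on thread closures mentioned above can be illustrated with a small std-only sketch. thread::spawn requires a 'static closure, so the closure cannot borrow a local variable; the usual way around this is to move an owned handle (here an Arc) into each closure. The function name sum_lengths_in_threads and the sample data are made up for this illustration.

```rust
use std::sync::Arc;
use std::thread;

// Hypothetical example function: splits the work over `workers` threads.
// `workers` must be greater than zero (step_by(0) would panic).
fn sum_lengths_in_threads(words: Vec<String>, workers: usize) -> usize {
    // Wrap the data in an Arc so that every thread can own a handle to it.
    let words = Arc::new(words);
    let mut handles = Vec::new();
    for w in 0..workers {
        // Each closure gets its own clone of the Arc, moved in with `move`,
        // so it borrows nothing from this stack frame.
        let words = Arc::clone(&words);
        handles.push(thread::spawn(move || {
            // Worker w handles items w, w + workers, w + 2 * workers, ...
            words
                .iter()
                .skip(w)
                .step_by(workers)
                .map(|s| s.len())
                .sum::<usize>()
        }));
    }
    // Join every worker and add up the partial sums.
    handles.into_iter().map(|h| h.join().unwrap()).sum()
}

fn main() {
    let words = vec!["aaa".to_string(), "bb".to_string(), "cc".to_string()];
    println!("total length: {}", sum_lengths_in_threads(words, 2));
}
```

Removing `move` (or passing `&words` directly) makes the compiler reject the program, which is the restriction described above.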

Anyway, I will investigate hyper now. (hyper uses tokio)

https://tokio.rs/

https://hyper.rs/

https://github.com/hyperium/hyper


Tuesday, January 14, 2020

Why no traits for collections in Rust?

I tried to write a generic function that converts a list into a receiver channel.
Originally I wrote this function to take a Vec, but I wanted to change it to a Collection type that is a sort of 'super' type of Vec; there is no such type in Rust.
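One possible workaround, sketched below with the std mpsc channel, is to make the function generic over IntoIterator instead of a collection trait. This is only an illustration of the idea, not the code from my project; the name to_receiver is made up here.

```rust
use std::collections::{BTreeSet, VecDeque};
use std::sync::mpsc::{channel, Receiver};

// Hypothetical helper: accepts anything iterable (Vec, VecDeque, BTreeSet, ...)
// and returns the receiving end of a channel preloaded with its items.
fn to_receiver<C>(items: C) -> Receiver<C::Item>
where
    C: IntoIterator,
{
    // The std channel is unbounded, so sending never blocks here.
    let (tx, rx) = channel();
    for item in items {
        // The receiver is still alive, so send cannot fail.
        tx.send(item).unwrap();
    }
    // tx is dropped on return, so iterating over rx terminates.
    rx
}

fn main() {
    let from_vec: Vec<i32> = to_receiver(vec![1, 2, 3]).iter().collect();
    let from_deque: Vec<&str> = to_receiver(VecDeque::from(vec!["a", "b"])).iter().collect();
    let set: BTreeSet<u8> = vec![3u8, 1, 2].into_iter().collect();
    let from_set: Vec<u8> = to_receiver(set).iter().collect();
    println!("{:?} {:?} {:?}", from_vec, from_deque, from_set);
}
```

IntoIterator only covers "something that yields items"; it does not give operations such as len or insert, which is where a real collection trait would be needed.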

There is interesting research in this direction in Rust.


https://www.reddit.com/r/rust/comments/83q25s/why_no_traits_for_collections_in_rust/

Why no traits for collections in Rust?

In Java, there are interfaces such as Collection and Map that are used to identify generic collections--e.g. ArrayList and HashSet implement Collection, and HashMap and TreeMap implement Map.
I'd like having this capability in Rust as well, in case I'm designing a library and want the user to be able to decide which implementation of a collection they'd like to use. However, I noticed Rust doesn't have traits to represent a generic collection. I was wondering if this was an intentional design decision and what the reasoning was.


http://smallcultfollowing.com/babysteps/blog/2016/11/02/associated-type-constructors-part-1-basic-concepts-and-introduction/


https://github.com/rust-lang/rfcs/pull/1598

Monday, January 13, 2020

Another sample code of SPMC (Single Producer Multiple Consumer) In Rust

The following code is a re-implementation of the simple Go channel program listed below.
It reads strings from an input channel, converts each one to a struct, and sends it to another channel; then, once all the input strings have been processed, it starts printing the data from that channel. (This behavior is not so natural, but it is just sample code to see how the join occurs.)

It is now possible to write Rust code that is quite similar to the corresponding Go channel code by using the crossbeam library.

 https://stjepang.github.io/2017/08/13/designing-a-channel.html
 https://stjepang.github.io/2019/01/29/lock-free-rust-crossbeam-in-2019.html


Although this is a simple program, it requires an SPMC channel. The current Rust std library only supports mpsc channels, and it becomes quite complicated if we try to simulate SPMC using only mpsc channels. See the related articles found on the web:

 https://medium.com/@polyglot_factotum/rust-concurrency-patterns-communicate-by-sharing-your-sender-11a496ce7791

The last part of the Rust book also describes a similar thing with a complicated approach.
Btw, this Rust code defines a generic create_receiver function, but Go cannot, since it has no generics.
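For comparison, here is a rough std-only sketch of the workaround in the style of the Rust book: a single mpsc Receiver is shared between workers by wrapping it in Arc<Mutex<...>>. The function name fan_out and the (worker id, item) result shape are assumptions made for this illustration, not part of the crossbeam sample.

```rust
use std::sync::mpsc::{channel, Receiver};
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical helper: fans one std mpsc receiver out to several workers
// and returns (worker id, item) pairs in arrival order.
fn fan_out(items: Vec<String>, workers: usize) -> Vec<(usize, String)> {
    let (in_tx, in_rx) = channel();
    for item in items {
        in_tx.send(item).unwrap();
    }
    drop(in_tx); // close the input so recv fails once the queue is drained

    // Receiver is not Clone, so it is shared behind a mutex instead.
    let shared_rx: Arc<Mutex<Receiver<String>>> = Arc::new(Mutex::new(in_rx));
    let (out_tx, out_rx) = channel();

    for id in 0..workers {
        let shared_rx = Arc::clone(&shared_rx);
        let out_tx = out_tx.clone();
        thread::spawn(move || loop {
            // The lock guard is a temporary, released at the end of this statement.
            let next = shared_rx.lock().unwrap().recv();
            match next {
                Ok(item) => out_tx.send((id, item)).unwrap(),
                Err(_) => break, // input closed and drained
            }
        });
    }
    drop(out_tx); // only the workers' clones keep the output channel open

    // Ends when every worker has finished and dropped its sender.
    out_rx.iter().collect()
}

fn main() {
    let items: Vec<String> = (0..10).map(|i| format!("s{}", i)).collect();
    let results = fan_out(items, 4);
    println!("received {} items", results.len());
}
```

Every worker has to take the same lock for each item, which is the extra ceremony that a cloneable crossbeam Receiver removes.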
 

In the end, Go makes it easy to write code that causes a data race,
while rustc guarantees that safe code has no data races.
In fact, from a programming point of view, Rust is a higher-level language for concurrency problems than Go, while its run-time performance is better than Go's.

Although crossbeam is not part of the std lib, it may become part of it this year, 2020(?)
https://blog.yoshuawuyts.com/rust-2020/

The code is available at https://github.com/calathus/channel-sample

Rust:

extern crate crossbeam;

use std::thread;
use crossbeam::crossbeam_channel::{Receiver, Sender, unbounded};
use crossbeam::sync::{WaitGroup};

const THREADS: usize = 4;

struct Info {
    n: i32,
    s: String,
}

fn create_receiver<T: Clone>(vec: Vec<T>) -> Receiver<T> {
    let (s, r) = unbounded();
    for e in vec.iter() {
        s.send(e.clone()).unwrap();
    }
    return r;
}

fn process_data(i: i32, ss_r: Receiver<String>, info_s: Sender<Info>) {
    for s in ss_r.iter() {
        info_s.send(Info{n: i, s: s}).unwrap();
    }
    println!("process_data {} done.", i);
}


fn main() {
    let vec = get_data();
    let ss_r = create_receiver(vec);
    let (info_s, info_r) = unbounded();

    thread::spawn(move || {
        let wg = WaitGroup::new();

        for i in 0..THREADS {
            let wg0 = wg.clone();
            let ss_r0 = ss_r.clone();
            let info_s0 = info_s.clone();

            thread::spawn(move || {
                process_data(i as i32, ss_r0, info_s0);
                drop(wg0);
            });
        }
        wg.wait();
        drop(info_s)
    });

    println!(">> start info printing.");
    for info in info_r.iter() {
        println!("n: {}, s: {}", info.n, info.s);
    }
    println!("done.");
}


fn get_data()-> Vec<String> {
    let mut v: Vec<String> = Vec::new();
    for i in 0..10000 {
        let s = format!("s{}", i);
        v.push(s);
    }
    return v;
}



GO:
package main

import (
    "fmt"
    "strings"
    "sync"
)

type Info struct {
    p int
    s string
}

func create_ssc() chan string {
    ssc := make(chan string)
    go func() {
        defer close(ssc)
        for _, s := range data() {
            ssc <- s
        }
    }()
    return ssc
}

func create_infoc() chan *Info {
    infos := make(chan *Info)
    return infos
}

func handle_ssc(n int, ssc chan string, infos chan *Info) {
    for s := range ssc {
        infos <- &Info{p: n, s: strings.ToUpper(s)}
    }
}

func main() {
    var width = 4

    ssc := create_ssc()
    infos := create_infoc()

    go func() {
        defer close(infos)
        var wg sync.WaitGroup
        wg.Add(width)

        for i := 0; i < width; i++ {
            go func(n int) {
                defer wg.Done()
                handle_ssc(n, ssc, infos)
            }(i)
        }
        wg.Wait()
    }()

    for i := range infos {
        fmt.Println(i)
    }
}

func data() []string {
    return []string{
        "aaa",
        "bb",
        "cc",
        "sss",
        "qq",
        "ww"}
}


Crossbeam: MPMC channel in Rust

I found that Rust does not yet officially support a Multi-Producer Multi-Consumer channel.
The std library only provides mpsc, i.e., a Multi-Producer Single-Consumer channel.

This makes it difficult to write Go-style channel programs in Rust.
There has been quite intensive research activity on supporting MPMC in recent years. See:

https://stjepang.github.io/2019/01/29/lock-free-rust-crossbeam-in-2019.html

Basically, at the beginning of 2019, crossbeam became available to fulfill this requirement.
Servo is already using it.
I don't know what happened to this library during 2019, but the GitHub repository is still active these days.



Sunday, January 12, 2020

AngularJS style web framework on Rust

I'm looking for a web framework that enables us to develop web applications in a style similar to AngularJS.
I don't like many aspects of AngularJS: it depends too much on runtime inspection/modification, which makes it difficult to reason about how it works. It is almost magical, voodoo-style programming.

But I like the clean separation of view and logic, with the view in HTML and the model in JS.
I looked at several web frameworks for Rust, and all of them seem to mix presentation (HTML) into Rust.
HTML often becomes very complicated, so it is not a good idea to include it in Rust code.
Of course, the action semantics must be written in Rust, but the Rust code should not include more than that.

Yew seems closest to my idea, but it seems to rely on the html! macro,
and actions are mixed into the HTML description.

Rocket seems to be a more server-based approach. I may use it for some server-side applications, but there may be a simpler library for that purpose. (I need to investigate later.)
Also, its development seems to have almost stopped a year ago, and a lot of the samples are too old. This is a bad sign.

--

So my plan is to investigate other frameworks that are closer to my ideal.
And I may develop the missing parts using wasm-bindgen, yew, and servo.

For instance,
1) We write HTML that includes directives, parse it, and generate plain HTML (no special directives) as well as the corresponding event-handling code.
The nasty part of code generation is that it is difficult to keep modified code and generated code in sync,
so ideally the generated code and the hand-coded part should coexist.
2) To implement such a code generator, we may use html5ever, Servo's HTML parser.
3) Then we write a transformer from HTML nodes with directives into plain HTML nodes, which also generates the associated action logic in Rust.
4) It might be simpler to use wasm-bindgen for this part rather than mapping to yew's element model,
since the latter would duplicate the node structure in the browser and in Rust.

Anyway, I need to look into these more.
There is probably already a project of this type that I am just not aware of.





Rust channel vs Go channel

There is an interesting article demonstrating how to write channel using Rust:

Multithreading in Rust with MPSC (Multi-Producer, Single Consumer) channels

I rewrote this code in Go to see the difference in coding style and runtime performance.

A Go channel merges the notions of sender and receiver into a single channel: the same channel is used both to send messages (data) and to receive them.
Also, the termination condition relies on a separate library type, sync.WaitGroup (wg).

I definitely prefer the clean, data-race-free Rust approach.
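To make the contrast concrete, here is a minimal std-only sketch (not the code from the article) of the Rust style: the channel is split into a Sender and a Receiver, each producer owns its own Sender clone, and the receiving loop ends by itself once every Sender has been dropped. The function name collect_from_producers is made up for this illustration.

```rust
use std::sync::mpsc::channel;
use std::thread;

// Hypothetical example: spawns `producers` threads that each send
// `per_producer` numbers, then collects everything on the single receiver.
fn collect_from_producers(producers: u32, per_producer: u32) -> Vec<u32> {
    let (tx, rx) = channel();
    for p in 0..producers {
        let tx = tx.clone(); // each producer owns its own Sender
        thread::spawn(move || {
            for k in 0..per_producer {
                tx.send(p * per_producer + k).unwrap();
            }
            // tx is dropped here, when the thread finishes
        });
    }
    drop(tx); // drop the original so only the producers hold senders

    // The iterator ends once every Sender has been dropped, so no
    // explicit close call or wait group is needed.
    rx.iter().collect()
}

fn main() {
    let mut values = collect_from_producers(4, 3);
    values.sort();
    println!("{:?}", values);
}
```

Forgetting the drop(tx) line would leave one Sender alive and the final collect would block forever, so the ownership rules do not remove the need to think about termination; they only make it part of the types.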

As for performance, as expected, Rust is faster than Go, but not significantly.
This is a rather simple case, though; for other kinds of processing that allocate from the heap in Go, the difference might be more significant.

For DIFFICULTY '0000000': Go: 12 sec, Rust: 9.7 sec

GO sample:
BASE: 42, THREADS 8, DIFFICULTY: 0000000
&{7 50443823 411e3c717da473d023d6c5aa11d330ffed3fd4c641bd75eafcc779b5e0000000}

[Done] exited with code=0 in 11.982 seconds
RUST sample:
    Finished release [optimized] target(s) in 0.03s
     Running `target\release\mpsc-crypto-mining.exe`
Attempting to find a number, which - while multiplied by 42 and hashed using SHA-256 - will result in a hash ending with 0000000.
Please wait...
Found the solution.
The number is: 50443823.
Result hash: 411e3c717da473d023d6c5aa11d330ffed3fd4c641bd75eafcc779b5e0000000.

real    0m9.661s

-----

For DIFFICULTY '00000004': Go: 1551 sec, Rust: 1444 sec

GO sample:
BASE: 42, THREADS 8, DIFFICULTY: 00000004
&{0 6829102344 aa60ef885f7d41903661d03a55aca85ae195fdb63bb1b4cbc03e804d00000004}

[Done] exited with code=0 in 1551.796 seconds
RUST sample:
$ time cargo run --release
   Compiling mpsc-crypto-mining v0.1.0 (C:\Users\nnaka\rust_projects\mpsc-crypto-mining)
    Finished release [optimized] target(s) in 1.34s
     Running `target\release\mpsc-crypto-mining.exe`
Attempting to find a number, which - while multiplied by 42 and hashed using SHA-256 - will result in a hash ending with 00000004.
Please wait...
Found the solution.
The number is: 6829102344.
Result hash: aa60ef885f7d41903661d03a55aca85ae195fdb63bb1b4cbc03e804d00000004.

real    24m4.707s

----

Go code:

package main

import (
 "crypto/sha256"
 "fmt"
 "strconv"
 "strings"
 "sync"
)

var BASE = 42
var THREADS = 8
var DIFFICULTY = "0000000" // 7 digit

func compute_sha256(number int) string {
 s := strconv.Itoa(number * BASE)
 h := sha256.New()
 h.Write([]byte(s))
 return fmt.Sprintf("%x", h.Sum(nil))
}

type Solution struct {
 start_at int
 x        int
 hash     string
}

func create_solution() chan *Solution {
 solution := make(chan *Solution)
 return solution
}

func verify_number(start_at int, number int) *Solution {
 hash := compute_sha256(number)
 if strings.HasSuffix(hash, DIFFICULTY) {
  return &Solution{start_at: start_at, x: number, hash: hash}
 } else {
  return nil
 }
}

func search_for_solution(start_at int, sender chan *Solution) {
 for number := start_at; ; number += THREADS {
  var solution = verify_number(start_at, number)
  if solution != nil {
   sender <- solution
   return
  }
 }
}

func main() {
 sender := create_solution()
 go func() {
  defer close(sender)
  var wg sync.WaitGroup
  wg.Add(1) // only wait for the first worker to report a solution

  for i := 0; i < THREADS; i++ {
   go func(start_at int) {
    defer wg.Done()
    search_for_solution(start_at, sender)
   }(i)
  }
  wg.Wait()
 }()

 for i := range sender { // infos_reveiver.recv()
  fmt.Printf("BASE: %d, THREADS %d, DIFFICULTY: %s\n", BASE, THREADS, DIFFICULTY)
  fmt.Println(i)
 }
}
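The Go program above partitions the search so that worker i tests i, i + THREADS, i + 2*THREADS, and so on, and the first worker to find a solution reports it. A std-only Rust sketch of the same partitioning is shown below. It is not the Rust program from the article: since std has no SHA-256, a toy predicate (is_solution) stands in for the hash check, and the names search and is_solution as well as the AtomicBool stop flag are assumptions made for this illustration.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc::channel;
use std::sync::Arc;
use std::thread;

// Toy stand-in for the SHA-256 check: "solved" when the decimal form of
// n * base ends with the given suffix.
fn is_solution(n: u64, base: u64, suffix: &str) -> bool {
    (n * base).to_string().ends_with(suffix)
}

// Returns (start_at of the winning worker, solution). `threads` must be > 0.
fn search(base: u64, suffix: &'static str, threads: u64) -> (u64, u64) {
    let found = Arc::new(AtomicBool::new(false));
    let (tx, rx) = channel();
    for start_at in 0..threads {
        let (tx, found) = (tx.clone(), Arc::clone(&found));
        thread::spawn(move || {
            // Worker start_at tests start_at, start_at + threads, ...
            let mut n = start_at;
            // Stop as soon as any worker has reported a solution.
            while !found.load(Ordering::Relaxed) {
                if is_solution(n, base, suffix) {
                    found.store(true, Ordering::Relaxed);
                    let _ = tx.send((start_at, n));
                    return;
                }
                n += threads;
            }
        });
    }
    drop(tx);
    // Take the first reported solution.
    rx.recv().expect("at least one worker reports a solution")
}

fn main() {
    let (start_at, n) = search(42, "000", 4);
    println!("worker {} found {}", start_at, n);
}
```

Unlike the Go version, the other workers are told to stop through the shared flag instead of being abandoned when main exits.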

Test Machine:

I used a laptop I purchased recently: Lenovo Flex 14 2-in-1 Convertible Laptop, 14 Inch FHD Touchscreen Display, AMD Ryzen 5 3500U Processor, 12GB DDR4 RAM, 256GB NVMe SSD, Windows 10. (I did not know AMD would release the Ryzen 4000H, but this may not be such a bad buy.)

This test was also useful for getting a sense of this CPU's performance. When we run with 8 threads, it utilizes all CPU threads; when we run the 4-thread version, it mainly uses 4 threads, and it takes 15 sec vs 12 sec (for 8 threads). Although the actual number of cores is 4, the 8-thread version is faster than the 4-thread version, but not twice as fast.


BASE: 42, THREADS 4, DIFFICULTY: 0000000
&{3 50443823 411e3c717da473d023d6c5aa11d330ffed3fd4c641bd75eafcc779b5e0000000}

[Done] exited with code=0 in 15.137 seconds

And when running on all threads, the clock frequency is boosted to 2.8 GHz. Not so fast.





So if we use a Threadripper 3990X, this might run 16 x (4/3) ≈ 21 times faster.
I wonder which would give the faster result: a GPU-accelerated version or a multicore version.


Wednesday, January 8, 2020

how to update node to v12 in ubuntu

It is hard to update node to the latest version on Ubuntu; even after adding the PPA for v8, the installed version stayed at v6 instead of moving to v8.
The following article worked:

use nvm


https://www.digitalocean.com/community/tutorials/how-to-install-node-js-on-ubuntu-16-04


  • sudo apt-get update
  • sudo apt-get install build-essential libssl-dev
Once the prerequisite packages are installed, you can pull down the nvm installation script from the project’s GitHub page. The version number may be different, but in general, you can download it with curl:
  • curl -sL https://raw.githubusercontent.com/creationix/nvm/v0.33.8/install.sh -o install_nvm.sh
And inspect the installation script with nano:
  • nano install_nvm.sh
Run the script with bash:
  • bash install_nvm.sh
It will install the software into a subdirectory of your home directory at ~/.nvm. It will also add the necessary lines to your ~/.profile file to use the file.
To gain access to the nvm functionality, you’ll need to log out and log back in again, or you can source the ~/.profile file so that your current session knows about the changes:
  • source ~/.profile
Now that you have nvm installed, you can install isolated Node.js versions.
To find out the versions of Node.js that are available for installation, you can type:
  • nvm ls-remote

Yew : Rust / Wasm client web app framework

https://github.com/yewstack/yew

Yew is a modern Rust framework inspired by Elm and React for creating multi-threaded frontend apps with WebAssembly.

The framework supports multi-threading & concurrency out of the box. It uses Web Workers API to spawn actors (agents) in separate threads and uses a local scheduler attached to a thread for concurrent tasks.


-----

this is the framework I was looking for.

Rocket was not a good framework for this. It may be used for server-side service development, but then it should not be called a web framework.
Also, its development stopped a year ago, and many of its libraries have become too old. So we should not use such a library.


--
In order to run the tests, we need node v8 or later (v8 is OK), because the tests use the new async syntax.
Older versions do not support it (without an option flag).
v12 is the better choice now.

Recursive Matrix and the parallel matrix multiplication using crossbeam and generic constant

This was planned project I posted before. Basically in order to evaluate Rust's claim for zero cost abstraction and the effectiveness o...