Will this one get the hybrid attention model treatment?

#7
by TomLucidor - opened

There are a lot of linear attention models, but not many of them do reasoning. Could some of this model's regular attention layers be converted into linear attention layers?
