Machine Learning TV
  • 137 videos
  • 1,843,677 views
Limitations of ChatGPT and LLMs - Part 3
If you haven't watched Part 1 and Part 2, I highly suggest watching them before Part 3.
Large Language Models (LLMs) have shown huge potential and have recently drawn much attention. In this presentation, Ameet Deshpande and Alexander Wettig give a detailed explanation of how Large Language Models and ChatGPT work. They make clear that they do not assume the audience has any prior knowledge of language models. They start with embeddings and give an explanation of Transformers as well. This is the last episode of this amazing series. Thanks for watching.
676 views

Videos

Understanding ChatGPT and LLMs from Scratch - Part 2
863 views · 1 year ago
Large Language Models (LLMs) have shown huge potential and have recently drawn much attention. In this presentation, Ameet Deshpande and Alexander Wettig give a detailed explanation of how Large Language Models and ChatGPT work. They make clear that they do not assume the audience has any prior knowledge of language models. They start with embeddings and give an explanation abou...
Understanding ChatGPT and LLMs from Scratch - Part 1
3.4K views · 1 year ago
Large Language Models (LLMs) have shown huge potential and have recently drawn much attention. In this presentation, Ameet Deshpande and Alexander Wettig give a detailed explanation of how Large Language Models and ChatGPT work. They make clear that they do not assume the audience has any prior knowledge of language models. They start with embeddings and give an explanation abou...
Understanding BERT Embeddings and How to Generate them in SageMaker
4.5K views · 1 year ago
Course link: www.coursera.org/learn/ml-pipelines-bert In this course, you will use BERT for the same purpose. Before diving into the BERT algorithm, I will highlight a few differences between BlazingText and BERT at a very high level. As you can see here, BlazingText is based on Word2Vec, whereas BERT is based on the transformer architecture. Both BlazingText and BERT generate word embeddings. Howe...
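
To make this concrete, here is a minimal sketch of generating BERT embeddings with the Hugging Face transformers library; the model name and the [CLS] pooling choice are my own illustrative assumptions (the course does this inside SageMaker instead):

```python
# Minimal sketch: BERT token and sentence embeddings via Hugging Face transformers.
# Illustrative only; not the course's SageMaker code.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("I love this product!", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Contextual token embeddings: one 768-dim vector per input token.
token_embeddings = outputs.last_hidden_state     # shape: (1, seq_len, 768)
# One common sentence embedding: the [CLS] token's vector.
sentence_embedding = token_embeddings[:, 0, :]   # shape: (1, 768)
```
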
Understanding Coordinate Descent
7K views · 1 year ago
Course link: www.coursera.org/learn/ml-regression Let's just have a little aside on the coordinate descent algorithm, and then we're gonna describe how to apply coordinate descent to solving our lasso objective. So, our goal here is to minimize some function g. So, this is the same objective that we have whether we are talking about our closed-form solution, gradient descent, or this coordinate d...
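
As a concrete sketch of the coordinate update (my own toy numpy code, assuming the objective RSS(w) + λ‖w‖₁ with no intercept; the course's exact notation may differ):

```python
# Minimal numpy sketch of cyclic coordinate descent for the lasso.
# Assumes objective sum_i (y_i - x_i.w)^2 + lam * ||w||_1, hence the lam/2 threshold.
import numpy as np

def soft_threshold(rho, lam):
    """Closed-form solution of each one-dimensional lasso subproblem."""
    if rho < -lam:
        return rho + lam
    if rho > lam:
        return rho - lam
    return 0.0

def lasso_coordinate_descent(X, y, lam, n_iters=100):
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_iters):
        for j in range(d):
            # Residual with feature j's current contribution removed.
            r_j = y - X @ w + X[:, j] * w[j]
            rho_j = X[:, j] @ r_j
            z_j = X[:, j] @ X[:, j]
            w[j] = soft_threshold(rho_j, lam / 2) / z_j
    return w
```
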
Bootstrap and Monte Carlo Methods
7K views · 1 year ago
Here we look at the two main concepts that are behind this revolution, the Monte Carlo method and the bootstrap. We will discuss the main principles behind these methods and then see how to apply them in various important contexts, such as in regression and for constructing confidence intervals. Course link: www.coursera.org/learn/stanford-statistics/
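
As one concrete instance of both ideas, here is a hedged sketch of a percentile bootstrap confidence interval for a sample mean; the data, the 95% level, and the 10,000 resamples are all illustrative choices of mine:

```python
# Minimal sketch: percentile bootstrap confidence interval for a mean.
import numpy as np

rng = np.random.default_rng(0)
sample = rng.normal(loc=170, scale=10, size=100)   # e.g. 100 measured heights

boot_means = np.array([
    rng.choice(sample, size=sample.size, replace=True).mean()  # resample WITH replacement
    for _ in range(10_000)                                     # Monte Carlo repetitions
])

lo, hi = np.percentile(boot_means, [2.5, 97.5])
print(f"95% bootstrap CI for the mean: ({lo:.2f}, {hi:.2f})")
```
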
Maximum Likelihood as Minimizing KL Divergence
2.8K views · 2 years ago
While Bayes' formula for the posterior probability of parameters given the data is very general, there are some interesting special cases that can be analyzed separately. Let's look at them in sequence. The first special case arises when the model is fixed once and for all. In this case, we can drop the conditioning on M in this formula. The Bayesian evidence, in this case, is ...
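
A short sketch of the titular identity, in my own notation (the lecture's may differ): with data drawn from p_data, maximizing the expected log-likelihood of a model p_θ is exactly minimizing the KL divergence from p_data to p_θ, because the entropy term does not depend on θ:

```latex
\mathrm{KL}\left(p_{\mathrm{data}} \,\|\, p_\theta\right)
  = \underbrace{\mathbb{E}_{x \sim p_{\mathrm{data}}}\!\left[\log p_{\mathrm{data}}(x)\right]}_{\text{independent of } \theta}
  - \mathbb{E}_{x \sim p_{\mathrm{data}}}\!\left[\log p_\theta(x)\right]
\;\Longrightarrow\;
\arg\min_\theta \mathrm{KL}\left(p_{\mathrm{data}} \,\|\, p_\theta\right)
  = \arg\max_\theta \mathbb{E}_{x \sim p_{\mathrm{data}}}\!\left[\log p_\theta(x)\right].
```
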
Understanding The Shapley Value
14K views · 2 years ago
Shapley Value is one of the most prominent ways of dividing up the value of a society, the productive value of some set of individuals, among its members. The Shapley Value is based on Lloyd Shapley's idea that members should basically receive things proportional to their marginal contributions. So, basically we look at what a person adds when we add them to a group...
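
A minimal sketch of that definition in code, averaging a player's marginal contribution over all join orders; the value function v is a made-up toy, and this brute force is only feasible for a handful of players:

```python
# Exact Shapley values by averaging marginal contributions over all orderings.
from itertools import permutations

def shapley_values(players, v):
    """v maps a frozenset of players to the value that coalition produces."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            # Marginal contribution of p when joining the current coalition.
            totals[p] += v(coalition | {p}) - v(coalition)
            coalition = coalition | {p}
    return {p: total / len(orderings) for p, total in totals.items()}

# Toy example: each player is worth 1 alone, 3 together (synergy).
v = lambda S: {0: 0, 1: 1, 2: 3}[len(S)]
print(shapley_values(["a", "b"], v))  # {'a': 1.5, 'b': 1.5}
```
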
Kalman Filter - Part 2
26K views · 2 years ago
Course Link: www.coursera.org/learn/state-estimation-localization-self-driving-cars Let's consider our Kalman Filter from the previous lesson and use it to estimate the position of our autonomous car. If we have some way of knowing the true position of the vehicle, for example, an oracle tells us, we can then use this to record a position error of our filter at each time step k. Since we're dea...
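
For intuition, here is a minimal one-dimensional Kalman filter sketch in the spirit of that exercise; the constant-position model and all noise values are my own illustrative choices, not the course's numbers:

```python
# Minimal 1-D Kalman filter: x_k = x_{k-1} + noise, z_k = x_k + noise.
import numpy as np

def kalman_1d(z_measurements, x0=0.0, P0=1.0, Q=0.01, R=0.5):
    x, P = x0, P0
    estimates = []
    for z in z_measurements:
        # Predict: state unchanged, uncertainty grows by process noise Q.
        P = P + Q
        # Update: blend prediction and measurement via the Kalman gain K.
        K = P / (P + R)
        x = x + K * (z - x)
        P = (1 - K) * P
        estimates.append(x)
    return np.array(estimates)

# Given oracle ground truth, one could record the error |x_hat - x_true| at each step k.
```
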
Kalman Filter - Part 1
97K views · 3 years ago
This course will introduce you to the different sensors and how we can use them for state estimation and localization in a self-driving car. By the end of this course, you will be able to: - Understand the key methods for parameter and state estimation used for autonomous driving, such as the method of least-squares - Develop a model for typical vehicle localization sensors, including GPS and I...
Recurrent Neural Networks (RNNs) and Vanishing Gradients
8K views · 3 years ago
For one, the way plain or vanilla RNNs model sequences, by recalling information from the immediate past, allows you to capture dependencies to a certain degree, at least. They're also relatively lightweight compared to n-gram models, taking up less RAM and space. But there are downsides: the RNN architecture, optimized for recalling the immediate past, causes it to struggle with longer sequ...
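
A tiny numpy illustration of why the gradients vanish (my own toy, not the video's code): backpropagating through T tanh steps multiplies the gradient by each step's Jacobian, so its norm shrinks geometrically when the recurrent weights are small.

```python
import numpy as np

rng = np.random.default_rng(0)
T, d = 50, 8
W = rng.normal(scale=0.2, size=(d, d))   # small recurrent weight matrix

# Forward pass: h_t = tanh(W h_{t-1} + x_t), storing the hidden states.
hs = [np.zeros(d)]
for _ in range(T):
    hs.append(np.tanh(W @ hs[-1] + rng.normal(size=d)))

# Backward pass: repeated multiplication by the step Jacobian diag(1 - h_t^2) W.
grad = np.ones(d)                        # arbitrary seed gradient dLoss/dh_T
for t in range(T, 0, -1):
    grad = W.T @ ((1 - hs[t] ** 2) * grad)
    if t % 10 == 0:
        print(f"step {t:2d}: |dLoss/dh| = {np.linalg.norm(grad):.2e}")
```
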
Transformers vs Recurrent Neural Networks (RNN)!
21K views · 3 years ago
Course link: www.coursera.org/learn/attention-models-in-nlp/lecture/glNgT/transformers-vs-rnns Using an RNN, you have to take sequential steps to encode your input, and you start from the beginning of your input making computations at every step until you reach the end. At that point, you decode the information following a similar sequential procedure. As you can see here, you have to go throug...
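
A toy numpy sketch of the contrast (illustrative only, with no learned projections): the RNN loop is inherently sequential, while self-attention encodes every position in one batched matrix computation.

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, d = 6, 4
X = rng.normal(size=(seq_len, d))        # one embedded input sequence

# RNN-style encoding: step t depends on step t-1, so it cannot be parallelized.
W = rng.normal(scale=0.3, size=(d, d))
h = np.zeros(d)
for x_t in X:
    h = np.tanh(W @ h + x_t)

# Self-attention encoding: all positions attend to all others at once.
def softmax(s):
    e = np.exp(s - s.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

Q = K = V = X                            # simplest case: no learned projections
attn = softmax(Q @ K.T / np.sqrt(d))     # (seq_len, seq_len) attention weights
encoded = attn @ V                       # every position computed in parallel
```
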
Language Model Evaluation and Perplexity
18K views · 3 years ago
Course Link: www.coursera.org/lecture/probabilistic-models-in-nlp/language-model-evaluation-SEO4T Transcript: In this video I'll show you how to evaluate a language model. The metric for this is called perplexity, and I will explain what this is. First, you'll divide the text corpus into train, validation, and test data; then you will dive into the concept of perplexity, an important metric used t...
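
As a minimal worked example (the probabilities here are made up): for a test set of N words, perplexity is the exponentiated average negative log-probability the model assigns to those words, so lower is better.

```python
# perplexity = exp(-(1/N) * sum_i log p(w_i | history))
import math

token_probs = [0.2, 0.1, 0.05, 0.3]      # model's probability for each test word
N = len(token_probs)
log_likelihood = sum(math.log(p) for p in token_probs)
perplexity = math.exp(-log_likelihood / N)
print(f"perplexity = {perplexity:.2f}")
```
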
Common Patterns in Time Series: Seasonality, Trend and Autocorrelation
8K views · 4 years ago
Course link: www.coursera.org/learn/tensorflow-sequences-time-series-and-prediction Time-series come in all shapes and sizes, but there are a number of very common patterns. So it's useful to recognize them when you see them. For the next few minutes we'll take a look at some examples. The first is trend, where time series have a specific direction that they're moving in. As you can see from th...
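
Here is a small synthetic sketch of those patterns, with all constants my own: a series combining trend, seasonality, and noise, plus a lag-k autocorrelation check.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(4 * 365)                               # four "years" of daily data
series = (0.05 * t                                   # upward trend
          + 10 * np.sin(2 * np.pi * t / 365)         # yearly seasonality
          + rng.normal(scale=2, size=t.size))        # noise

def autocorr(x, lag):
    """Correlation of the series with itself shifted by `lag` steps."""
    return np.corrcoef(x[:-lag], x[lag:])[0, 1]

print(f"lag-365 autocorrelation: {autocorr(series, 365):.2f}")  # high: seasonal period
print(f"lag-180 autocorrelation: {autocorr(series, 180):.2f}")  # lower: off-cycle
```
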
Limitations of Graph Neural Networks (Stanford University)
14K views · 4 years ago
Understanding Metropolis-Hastings algorithm
69K views · 4 years ago
Learning to learn: An Introduction to Meta Learning
27K views · 4 years ago
Page Ranking: Web as a Graph (Stanford University 2019)
3.4K views · 4 years ago
Deep Graph Generative Models (Stanford University - 2019)
19K views · 4 years ago
Graph Node Embedding Algorithms (Stanford - Fall 2019)
67K views · 4 years ago
Graph Representation Learning (Stanford University)
94K views · 4 years ago
Understanding Word Embeddings
10K views · 4 years ago
Variational Autoencoders - Part 2 (Modeling a Distribution of Images)
1.6K views · 4 years ago
Variational Autoencoders - Part 1 (Scaling Variational Inference & Unbiased estimates)
2.9K views · 4 years ago
DBSCAN: Part 2
21K views · 5 years ago
DBSCAN: Part 1
29K views · 5 years ago
Gaussian Mixture Models for Clustering
90K views · 5 years ago
Understanding Irreducible Error and Bias (By Emily Fox)
7K views · 5 years ago
Python Libraries for Machine Learning You Must Know!
1.9K views · 5 years ago
Conditional Probability
1.4K views · 5 years ago

Comments

  • @homeycheese1 · 8 days ago

    will coordinate descent always converge using LASSO even if the ratio of number of features to number of observations/samples is large?

  • @muhammadaneeqasif572 · 18 days ago

    Amazing, great to see some good content again. Thank the YT algorithm! Keep it up.

  • @stewpatterson1369 · 21 days ago

    best video i've seen on this. great visuals & explanation

  • @pnachtwey · 22 days ago

    This works OK on nice functions like g(x,y) = x^2 + y^2, but real data often looks more like the Grand Canyon, where the path is very narrow and winding.

  • @sELFhATINGiNDIAN · 1 month ago

    No

  • @kacpersarnowski7969 · 1 month ago

    Great video, you are the best :)

  • @frielruambil6275 · 1 month ago

    Thanks very much, I was looking for such videos to answer my assignment questions, and you answered all of them at once within 3 minutes. I salute you; please keep making more videos to help students pass their exams and assignments.

  • @NeverHadMakingsOfAVarsityAthle · 2 months ago

    Hey! Thanks for the fantastic content :) I'm trying to understand the additivity axiom a bit better. Is this axiom the main reason why Shapley values in machine learning forecasting can just be added up for one feature over many different predictions? Let's say we have predictions for two different days in a time series, and each time we calculate the Shapley value for the price feature. Does the additivity axiom then imply that I can add up the Shapley values for price across these two predictions (assuming they are independent) to make a statement about the importance of price over multiple predictions?

  • @somerset006 · 3 months ago

    What about self-driving rockets?

  • @paaabl0. · 4 months ago

    Shapley values are great, but not gonna help you much with complex non-linear patterns, especially in terms of global feature importance

  • @williamstorey5024 · 4 months ago

    what is text regression?

  • @yandajiang1744 · 5 months ago

    Awesome explanation

  • @user-vh9de5dy9q · 5 months ago

    Why do the given weights for the distributions not really match the distributions shown on the graph? I mean, I would choose π1 = 45, π2 = 35, π3 = 20.

  • @thechannelwithoutanyconten6364 · 5 months ago

    Two things: 1. What the H matrix is has not been described. 2. One non-1x1 matrix cannot be smaller or greater than another. This is sloppy. Besides that, it is great work.

  • @obensustam3574 · 5 months ago

    I wish there was a Part 3 :(

  • @DenguBoom · 5 months ago

    Hi, about the sample X1 to Xn: do X1 and Xn have to be different? Earlier, you had a sample of 100 heights from 100 different people. Or can it be like the bootstrap, where X1* to Xn* are drawn randomly from X1 to Xn, so we could basically draw the same person's height more than once?

  • @feriyonika7078 · 6 months ago

    Thanks, I understand the KF better now.

  • @usurper1091 · 6 months ago

    7:10

  • @lingfengzhang2943 · 7 months ago

    Thanks! It's very clear

  • @user-uk2rv4kt8d · 7 months ago

    very good video. perfect explanation!

  • @sadeghmirzaei9330 · 7 months ago

    Thank you so much for your explanation.🎉

  • @laitinenpp · 7 months ago

    Great job, thank you!

  • @SCramah13 · 8 months ago

    Clean explanation. Thank you very much...cheers~

  • @felipela2227 · 8 months ago

    Your explanation was great, thx

  • @vambire02 · 8 months ago

    Disappointed ☹️ no part 3

  • @Commonsenseisrare · 9 months ago

    Amazing lecture on GNNs.

  • @cmobarry · 10 months ago

    I like your term "Word Algebra". It might be an unintended side effect, but I have been pondering it for years!

  • @rakr6635 · 10 months ago

    no part 3, sad 😥

  • @vgreddysaragada · 10 months ago

    Great work..

  • @boussouarsari4482 · 11 months ago

    I believe there might be an issue with the perplexity formula. How can we refer to 'w' as the test set containing 'm' sentences, denoting 'm' as the number of sentences, and then immediately after state that 'm' represents the number of all words in the entire test set? This description lacks clarity and coherence. Could you please clarify this part to make it more understandable?

  • @GrafBazooka · 11 months ago

    I can't concentrate, she is too hot 🤔😰

  • @sunnelyeh · 11 months ago

    This video means the F/A-18 has the capability to lock onto a UFO!

  • @thefantasticman · 11 months ago

    Hard to focus on the PPT, can anyone explain to me why?

  • @nunaworship · 11 months ago

    Can you please share the link to the books you recommended?

  • @AoibhinnMcCarthy · 1 year ago

    Hard to follow, not concise.

  • @jcorona4755 · 1 year ago

    They pay so that people see it has more followers. In fact, you pay $10 pesos for each video.

  • @g-code9821 · 1 year ago

    Isn't the positional encoding done with the sinusoidal function?

  • @homataha5626 · 1 year ago

    Hello, thank you for sharing. Do you have a code repository? I only learn after I implement things.

  • @because2022 · 1 year ago

    Great content.

  • @robinranabhat3125 · 1 year ago

    Anyone: at 31:25, shouldn't the final equation at the bottom right be about minimizing the loss? I think that's a typo.

  • @Karl_with_a_K · 1 year ago

    I have run into token exhaustion while working with GPT-4, specifically when it is giving programming-language output. I'm assuming resolving this will be a component of GPT-5...

  • @yifan1342 · 1 year ago

    sound quality is terrible

    • @nehalkalita · 10 months ago

      Turning on subtitles can be helpful to some extent.

  • @majidafra · 1 year ago

    I deeply envy those who have been in your NN & DL class.

  • @josephzhu5129 · 1 year ago

    Great lecture, he knows how to explain complicated ideas, thanks a lot!

  • @chris-dx6oh · 1 year ago

    Great video

  • @ssvl2204 · 1 year ago

    Very nice and concise presentation, thanks!

  • @zhaobryan4441 · 1 year ago

    super super clear!

  • @lara6893 · 1 year ago

    Emily and Carlos rock, heck yeah!!

  • @StratosFair · 1 year ago

    Great video! Are you guys planning to upload follow-up lectures on this topic?

  • @StratosFair · 1 year ago

    Where is the video on recursive least squares, though?